Ascaris parasitic intestinal nematodes have been known to express a protein, PI-3,
with  specific  pepsin  inhibitory  properties.  This  study  explored  the  structural  properties 
that  lead  to  the  formation  of  a stable  tertiary  structure,  the  properties  that  determined  the 
inhibitory specificity, and the mechanism of inhibition. The recombinant PI-3 was
purified  from  bacteria  using  an  inclusion  body  preparation  of  the  protein,  protein 
refolding,  ammonium  sulfate  precipitation,  and  gel  filtration  chromatography.  This 
inhibitor  was  predicted  to  be  composed  of  a mixture  of  helix  and  strand  elements  from 
secondary  structure  programs  and  circular  dichroism.  The  structural  stability  of  PI-3  was 
measured  with  urea-induced  unfolding  experiments.  The  fluorescence  unfolding  curves 
of PI-3 in reduced conditions had a midpoint of urea denaturation 1.4 M lower than in
oxidized  conditions.  These  data  were  used  to  show  that  the  free  energy  of  protein 
stability  for  PI-3  in  the  oxidized  state  was  10.1  kcal/mol,  a value  3.4  kcal/mol  greater 
than for the protein in reduced conditions. Purification of the PI-3-pepsin complex by gel
filtration chromatography, followed by separation of the fractions by SDS-PAGE, showed that
PI-3 inhibits pepsin through a non-covalent interaction that proceeds without peptide bond
hydrolysis. Without a tertiary structure of PI-3, site-directed
mutations  were  generated  to  locate  residues  that  interact  with  the  proteinases.  Four  lysine 
residues  were  individually  mutated  to  leucine  and  glutamic  acid  residues.  Though  these 
had  been  implicated  as  likely  reactive  residues,  none  of  these  eight  mutant  PI-3  proteins 
showed  dramatic  changes  in  the  affinity  to  pepsin.  The  greatest  change  was  measured  for 
the K72E and K72L mutations, which had 2.5-fold stronger and 4.0-fold weaker affinity, respectively,
compared to the wild-type protein, suggesting the residue forms hydrogen bonds but is
not  critical  to  the  complex  stabilization.  The  mutated  proteins  appeared  to  follow  similar 
patterns of association to other aspartic proteinases, generally decreasing in affinity 10-fold
from pepsin to cathepsin E to plasmepsin II and to cathepsin D. Based on
differences  of  the  surface  charges  of  aspartic  proteinases,  the  lack  of  positive  charges  on 
pepsin  is  hypothesized  to  contribute  to  PI-3  specificity  for  pepsin. 

CHAPTER 1

PROTEINASE  INHIBITION  BY  NATURAL  PROTEINS 


Protein  Interactions 

Protein  Structure 

Enzymes  catalyze  chemical  reactions  that  are  essential  for  life.  The  catalytic 
function  of  an  enzyme  requires  the  correct  conformation  of  the  protein.  The  principal 
determinant  of  the  proper  orientation  of  the  protein  into  an  active  conformation  comes 
from  the  genetic  code.  The  translated  amino  acid  sequence  provides  enough  information 
for  most  proteins  to  fold  into  the  active  or  a near-active  conformation  (Anfinsen,  1973). 
Incorrectly  folded  proteins  are  assisted  in  this  folding  in  vivo  by  heat  shock  proteins,  i.e. 
chaperones.  The  translated  protein  sequence  also  contains  motifs  that  direct  post- 
translational  modification  of  the  protein  through  a variety  of  mechanisms: 
oligomerization,  glycosylation,  proteolytic  processing,  and  other  covalent  modifications 
and  cofactor  associations. 

The  amino  acid  sequence  of  the  protein  determines  how  the  protein  will  fold  and 
become  stabilized  in  the  tertiary  structure.  Forces  contributing  to  protein  stability  are 
hydrophobic  effects,  hydrogen  bonding,  electrostatic  interactions,  and  van  der  Waals' 
contacts.  Recent  reviews  have  suggested  that  the  hydrophobic  effect  and  hydrogen 
bonding  have  the  greatest  influence  on  protein  conformational  stability  (Lins  and 
Brasseur, 1995, Pace et al., 1996). Protein stability is primarily determined by the
balance between the destabilizing forces of decreased conformational entropy and the burial of
potential hydrogen bonding groups, and the stabilizing forces of hydrophobic group burial
and hydrogen bonding within the protein core.

An understanding of the correlation between amino acid sequences and structures of
fully folded proteins, along with the contributions of forces that stabilize a protein, has
been used to make predictions about protein structures. Although the database of known protein
structures continues to grow, predicting protein structures from amino acid sequences
has met with limited success. Prediction models for secondary structural motifs have
been  moderately  successful  with  many  available  algorithms  that  achieve  60-75% 
accuracy compared to solved protein structures (Rost and Sander, 1996, Cuff and Barton,
1999).  However,  predicting  the  three-dimensional  structure  from  an  amino  acid  sequence 
with  no  homologue  in  the  Protein  Data  Bank  is  still  unattainable.  Protein  structures  with 
known  homologues  can  be  reasonably  well  modeled  based  on  the  degree  of  sequence 
alignment  (Rost  and  Sander,  1996).  While  predicted  models  of  protein  structures  may  be 
fairly  accurate,  experimentally  defined  structures  will  still  be  required  for  precise 
knowledge  concerning  the  three-dimensional  arrangement  of  amino  acids.  The  tertiary 
structures of proteins and the forces that contribute to stable protein structures can be
determined,  though  these  analyses  require  great  efforts  in  time  and  resources.  With  the 
human  genome  project  expanding  the  number  of  discovered  genes,  and  subsequent 
protein  amino  acid  sequences,  the  discovery  of  the  structures  and  functions  of  these 
myriad  proteins  has  become  a challenge.  Due  to  the  human  genome  project,  novel  or 
improved  protein  analysis  tools  are  needed  for  early  predictions  and  complete 
determinations  of  protein  structures  and  functions  for  these  new  genes  in  the  arising  field 
of  proteomics. 

Protein-Protein Interactions

Enzyme catalysis of reactions has often been used to establish functional
properties  of  a protein  including  predictions  of  the  chemical  mechanism.  For  some 
proteins,  enzyme  catalysis  involves  protein-protein  interactions  or  is  regulated  by 
contacts  with  other  proteins.  Structures  of  proteins  and  of  complexes  of  proteins  can  be 
used  to  study  the  thermodynamic  forces  underlying  protein-complex  formation.  Forces 
that  contribute  to  protein  interactions  are  similar  to  the  forces  that  influence 
intramolecular  stability.  At  interfaces  between  monomeric  surfaces  of  homodimer 
proteins,  correlations  exist  between  hydrophobicity  and  stability  and  the  loss  of 
accessible surface area and stability upon oligomer formation (Greenfield and Hitchcock-DeGregori,
1995). Heterocomplexes such as antibody-antigen or proteinase-inhibitor
interactions tend to show similar trends in changes in accessible surface area with
stability but are less dependent on hydrophobic interactions (Jones and Thornton,
1996). Heterocomplexes also tend to have more hydrogen bonding at the interface than
homodimers,  and  this  has  been  attributed  to  the  requirement  that  the  individual  proteins 
need  to  be  conformationally  stable  without  large  hydrophobic  surface  patches. 

Mutational  analyses  of  amino  acid  residues  at  the  protein  interface  determine  the  factors 
that  dictate  the  specificity  of  one  protein  for  another  and  that  stabilize  the  physical 
interaction. 

From  the  knowledge  gained  on  protein-protein  interfaces,  predictions  about  the 
interactions  of  proteins  that  have  not  been  studied  may  become  useful  in  developing 
hypotheses about novel protein mechanisms of interaction. Based on sequence
similarities, assumptions about the function of a novel protein may be made.
Prediction methods for protein-protein interactions, or docking studies, continue to be
developed  based  on  the  criteria  often  learned  from  structural  analyses  regarding 
hydrophobicity,  surface  area  exposure,  hydrogen  bonding,  and  side  chain  conformational 
energies  (Jackson  and  Sternberg,  1995).  However,  advancements  to  models  have  come 
from  the  inclusion  of  electrostatic  interactions  to  improve  the  docking  predictions  (Gabb 
et  al.,  1997).  Even  though  docking  algorithms  are  being  developed,  identifying  physical 
contacts  between  proteins  for  which  no  homologues  exist  requires  extensive  analyses  and 
is  not  easily  predicted. 

One of the common models for studying the physical properties of intermolecular
interactions is the protein-inhibitor system. These protein-protein interactions are
readily  studied  because  the  dissociation  constants  of  these  complexes  are  in  the 
nanomolar  concentration  range,  and  the  interactions  of  stable  complexes  can  be 
evaluated.  With  an  ever-growing  population  of  known  structures  of  complexes  between 
proteinases  and  inhibitors,  these  have  been  used  as  models  for  protein  docking  studies 
and  structure  prediction  methods. 
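
The relationship between a dissociation constant and the strength of a complex can be illustrated with the standard relation ΔG° = RT ln(Kd), where Kd is expressed relative to a 1 M standard state. The Python sketch below is an illustration only: the function name, the temperature of 298.15 K, and the example Kd values are assumptions chosen for the example and are not data from this study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temp_kelvin: float = 298.15) -> float:
    """Standard free energy of association (kcal/mol) from a dissociation constant.

    Uses dG = RT * ln(Kd / 1 M); a more negative value means a tighter complex.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return R_KCAL * temp_kelvin * math.log(kd_molar)


if __name__ == "__main__":
    # Illustrative (hypothetical) micromolar, nanomolar, and picomolar complexes.
    for kd in (1e-6, 1e-9, 1e-12):
        print(f"Kd = {kd:.0e} M  ->  dG = {binding_free_energy(kd):.1f} kcal/mol")
```

Under these assumptions, each 10-fold decrease in Kd corresponds to roughly 1.4 kcal/mol of additional binding free energy, which is why nanomolar complexes are stable enough to be isolated and characterized.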

Proteinase-Inhibitor  Interactions 

Proteinases 

Proteinases,  endopeptidases,  or  proteases  are  all  terms  used  to  describe  enzymes 
that  catalyze  the  hydrolysis  of  peptide  bonds,  and  the  three  terms  are  commonly 
interchanged.  Four  major  classes  of  proteinases  have  been  defined  based  on 
distinguishing  components  of  the  chemical  mechanisms:  serine,  cysteine,  metallo,  and 
aspartic  proteinases.  These  protein  classes  have  been  organized  into  sub-classes,  known 
as  clans  (Barrett  et  al.,  1998).  Some  clans  are  rather  different  from  the  majority  of  the 
proteins in a class such as the proteasome threonine proteinase, a clan of serine
proteinases,  and  the  pepstatin-insensitive  carboxyl  proteinases,  a clan  of  the  aspartic 
proteinases.  Between  classes  and  within  any  class  of  these  enzymes,  peptide  bond 
hydrolysis  occurs  under  a wide  variety  of  conditions  and  environments.  Likewise,  the 
substrate  specificity  of  proteolysis  widely  varies  between  proteinase  classes  and  within 
any  class,  and  the  specificity  is  based  on  the  composition  of  active  site  amino  acid 
residues  of  these  proteins.  The  topologies  of  the  active  sites  of  proteinases  contain 
pockets,  or  subsites,  into  which  substrate  side  chains  form  contacts,  typically  in  an 
extended β-strand conformation. The nomenclature for proteinase subsites and
substrate positions is diagrammed in Figure 1-1 and is based on the method proposed by
Schechter  and  Berger  (1967). 
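
The Schechter and Berger convention can be sketched in a few lines of Python: residues on the amino-terminal side of the scissile bond are numbered P1, P2, P3 moving away from the bond, residues on the carboxy-terminal side are numbered P1', P2', P3', and each position sits in the subsite of the same index (S1, S1', and so on). The function names and the example peptide below are hypothetical and serve only to illustrate the labeling.

```python
def label_positions(peptide: str, scissile_after: int) -> list[tuple[str, str]]:
    """Label each residue of a peptide with its Schechter-Berger position.

    peptide        -- one-letter amino acid sequence, written N- to C-terminus
    scissile_after -- number of residues on the N-terminal side of the scissile bond
    """
    if not 0 < scissile_after < len(peptide):
        raise ValueError("the scissile bond must lie between two residues")
    labels = []
    for index, residue in enumerate(peptide):
        if index < scissile_after:
            # Non-prime side: numbering increases away from the bond.
            labels.append((f"P{scissile_after - index}", residue))
        else:
            # Prime side: numbering increases away from the bond.
            labels.append((f"P{index - scissile_after + 1}'", residue))
    return labels


def subsite_for(position: str) -> str:
    """Return the enzyme subsite (S) that accommodates a peptide position (P)."""
    return "S" + position[1:]


if __name__ == "__main__":
    # Hypothetical hexapeptide cleaved between the third and fourth residues.
    for position, residue in label_positions("AKLFVE", 3):
        print(f"{residue}: {position} binds in {subsite_for(position)}")
```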

Proteinases are naturally regulated to prevent random, damaging proteolysis of
cellular proteins and extracellular tissues and unwanted activation of protein cascades, such as
blood clotting. Proteinases are regulated by a variety of mechanisms: rates of
transcription  and  translation,  sub-cellular  targeting,  zymogen  activation,  inactivation  by 
inhibitors,  and  eventual  protein  degradation.  Most  proteinases  are  regulated  by  a 
prosegment  when  these  proteins  are  expressed  as  zymogens.  These  zymogens  remain 
stable  until  specific  conditions  are  met:  for  some  proteinases  the  acidification  of  the 
environment  can  lead  to  autoactivation,  while  other  proteins  require  activating  enzymes 
to  hydrolyze  the  prosegment  from  the  active  enzyme.  Many  other  proteinases  are 
regulated  by  the  presence  of  specific  proteinaceous  inhibitors.  The  inhibitors 
antithrombin III with heparin cofactor and protease nexin 1 regulate the activity
of proteinases in the blood coagulation pathway, while tissue inhibitors of
metalloproteinases are known to regulate the remodeling of extracellular matrix proteins.

Figure 1-1. Diagram Based on the Schechter and Berger Nomenclature for Proteinase

Subsites (S) and peptide positions (P) are related to the position of the peptide substrate
in the proteinase active site, and the peptide scissile bond is depicted as the line broken
by an asterisk (*).

Pathologies  due  to  faulty  proteinase  regulation  are  becoming  better  understood. 
Unregulated thrombin and Factor Xa may lead to abnormal blood clotting, and
mutations to antithrombin III are sometimes a factor in thromboembolism (Pabinger and
Schneider, 1996). Calpain, a cysteine proteinase, has been identified in patient serum as
a factor contributing to rheumatoid arthritis (Menard and El-Amine, 1996). Matrix
metalloproteinases  have  been  implicated  in  cancer  invasion  (Henriet  et  al.,  1999). 
Cathepsin  D has  been  implicated  in  tumor  progression  and  amyloid  plaque  formations 
(Ladror  et  al.,  1994,  Higaki  et  al.,  1996). 

As modulators of proteinase function, proteinase inhibitors have been studied in
great  detail  as  to  their  regulatory  roles  and  as  possible  drug  candidates.  Aprotinin  or 
pancreatic  trypsin  inhibitor  has  been  a therapeutic  for  shock  syndromes,  hyperfibrinolytic 
hemorrhage,  and  acute  pancreatitis  under  the  trade  name  Trasylol  (Fritz  and  Wunderer, 
1983).  Nearly  all  proteinase  inhibitors  isolated  from  eukaryotes  have  been  proteins  of 
widely ranging sizes, from 50 to 450 amino acid residues. Bacterial sources do,
however,  produce  peptide  inhibitors,  such  as  the  general  aspartic  proteinase  inhibitor 
from  the  actinomycetes,  pepstatin.  Nearly  all  proteinase  inhibitors  that  have  been 
crystallized  in  a complex  with  an  associated  protease  have  been  shown  to  block  the  active 
site  of  the  protease  (Bode  et  al.,  1999).  Different  mechanisms  may  be  employed  in  the 
physical  interaction  of  the  inhibitor  with  the  protease.  In  described  mechanisms,  multiple 
contacts are made between the inhibitor and the target protease; however, the different
types of contact seem to impart different functions. Some contacts coordinate the
specificity and docking to the target, and others are more important for the strength of the
binding  affinity. 

The  serine  proteinase  inhibitors  have  been  the  most  thoroughly  studied  of  all 
protease  inhibitor  classes,  and  the  mechanisms  of  the  Kunitz  and  serpin  inhibitor  families 
will  be  described  along  with  some  variations  to  these  common  mechanisms.  Other 
classes  of  protease  inhibitors  will  be  described,  including  cysteine  proteinase  inhibitors 
and  inhibitors  of  metalloproteinases.  Even  though  natural  inhibitors  of  aspartic 
proteinases exist, no explicit mechanism of intermolecular inhibition of aspartic
proteinases  by  proteinaceous  inhibitors  has  been  defined.  However,  some  detailed 
analyses  of  the  intramolecular  inhibition  of  aspartic  proteinases  have  been  performed 
with  some  interestingly  varied  results. 

Serine  Proteinases  and  Inhibitors 

Serine proteinases. Serine proteinases are a family of proteinases that includes
proteins as diverse as the digestive enzymes trypsin and chymotrypsin, blood clotting
factors (e.g. Factor Xa and thrombin), and the processing proteins of many viruses,
including hepatitis C virus and cytomegalovirus. This class of proteinases generally
depends on an active site motif of a serine, histidine, and aspartate (the catalytic triad)
for the enzymatic mechanism of peptide bond hydrolysis. The substrate is
hydrolyzed by a nucleophilic attack from the serine on the carbonyl carbon of the scissile
peptide  bond,  followed  by  the  donation  of  a proton  from  the  imidazole  ring  of  histidine  to 
the amide nitrogen. Two classes of regulatory serine proteinase inhibitors, the small
canonical inhibitors (e.g. BPTI) and the large serpins, and some xenoantigenic inhibitors
of thrombin are described below.

Kunitz inhibitors. The canonical inhibition of proteinases has been thoroughly
studied  with  the  bovine  pancreatic  trypsin  inhibitor  (BPTI).  The  Kunitz  (or  BPTI) 
protein fold is composed of an N-terminal helix, an extended loop, an antiparallel β-sheet
and  a C-terminal  helix.  Three  disulfide  bonds  are  present  in  the  protein  fold,  including 
one  half-cystine  located  at  the  P2  position  in  the  reactive  site.  BPTI  contains  58  amino 
acid  residues,  and  the  highly  stable  structure  of  the  protein  is  due  to  the  disulfide  bonds 
and  the  tightly  packed  core  of  hydrophobic  residues,  with  polar  residues  on  the  surface. 
This  protein  fold  has  also  been  observed  multiple  times  in  some  larger  proteinase 
inhibitors;  tissue  factor  pathway  inhibitor,  a simultaneous  inhibitor  of  Factor  Xa  and 
Factor VIIa (Broze et al., 1988), contains three Kunitz domains. The canonical model for
proteinase  inhibition  has  been  thoroughly  reviewed  (Laskowski  and  Kato,  1980).  The 
inhibitors  bind  to  the  enzyme  in  a substrate-like  manner  extending  through  the  active  site 
of the proteinase target in an antiparallel β-sheet formed with a portion of the N-terminal
domain of the proteinase. A lysine and an alanine are located in the P1-P1' positions in
the  active  site,  and  the  peptide  bond  between  the  two  residues  is  very  slowly  hydrolyzed 
by  the  protease  (Tschesche  and  Kupfer,  1976).  The  affinity  for  different  serine 
proteinases  differs,  and  dissociation  constants  for  the  inhibition  are  sub-nanomolar  for 
many  targets.  This  common  mode  of  inhibition  is  modified  by  the  larger  serine 
proteinase  inhibitors,  known  as  serpins. 
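
When the dissociation constant is sub-nanomolar, it is often comparable to or smaller than the enzyme concentration in an assay, so the free inhibitor concentration can no longer be approximated by the total. Inhibition of this kind is commonly analyzed with the tight-binding (Morrison) equation. The sketch below is a minimal illustration under stated assumptions: `ki` is the apparent inhibition constant under the assay conditions, all concentrations share the same units, and the function name and example values are hypothetical rather than taken from the studies cited above.

```python
import math


def fractional_activity(e_total: float, i_total: float, ki: float) -> float:
    """Fraction of enzyme activity remaining (v_i / v_0) for a tight-binding inhibitor.

    Morrison equation: the concentration of enzyme-inhibitor complex is the
    physically meaningful root of the quadratic mass-balance expression.
    """
    if e_total <= 0 or i_total < 0 or ki <= 0:
        raise ValueError("concentrations must be positive (inhibitor may be zero)")
    total = e_total + i_total + ki
    complex_conc = (total - math.sqrt(total * total - 4.0 * e_total * i_total)) / 2.0
    return 1.0 - complex_conc / e_total


if __name__ == "__main__":
    enzyme = 1e-9  # hypothetical 1 nM enzyme
    ki = 1e-10     # hypothetical 0.1 nM apparent Ki
    for inhibitor in (0.0, 0.5e-9, 1e-9, 2e-9):
        activity = fractional_activity(enzyme, inhibitor, ki)
        print(f"[I] = {inhibitor:.1e} M  ->  v_i/v_0 = {activity:.3f}")
```

In the limit where Ki is far above the enzyme concentration, the expression reduces to the familiar 1 / (1 + [I]/Ki); in the opposite limit, activity falls almost linearly with added inhibitor until the enzyme is titrated.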

Serpins.  The  serpin  family  of  inhibitors  includes  proteins  that  assist  in  the 
regulation  of  the  cascades  of  blood  coagulation,  fibrinolysis,  and  complement  activation. 
These  glycosylated  proteins  are  composed  of  about  400  amino  acid  residues.  Serpins 
interact  with  target  proteinases  by  extending  a reactive  loop  into  the  active  site  of  the 
proteinase as in the case of the canonical mechanism, but the proteinase hydrolyzes a
peptide bond of the reactive loop. A transient covalent complex forms through an ester
bond between the hydroxyl oxygen of the catalytic serine residue and the carbonyl carbon
of the P1 residue in the reactive loop of the serpin (Gils and DeClerck, 1998).
Additionally, following peptide bond hydrolysis, the loop containing the covalent P1-Ser
bond undergoes a translational movement distal to the proteinase active site, likely
moving the entire target proteinase (Wilczynska et al., 1995, Stratikos and Gettins, 1997).
The major determinant of the inhibitory specificity of the serpins is the P1 residue in
the reactive site loop. Mutating the P1 residue has altered the specificity and affinity of
the inhibitor toward various targets (Heinz et al.,
1992, Jiang et al., 1995).

Parasite  inhibitors.  Other  recent  interesting  examples  of  proteinase  inhibitors  of 
serine  proteinases  come  from  hematophagous  parasitic  organisms.  The  protein  hirudin 
has  been  isolated  from  the  saliva  of  the  leech  Hirudo  medicinalis  and  is  a specific 
inhibitor of thrombin, biologically preventing a blood clot from forming while the leech takes a
blood meal. The solved complex structure of hirudin with thrombin by x-ray
crystallography  indicated  that  the  first  three  amino  acids  of  hirudin  form  hydrogen  bonds 
and  hydrophobic  interactions  in  the  S2  and  S3  subsites  of  thrombin  (Grutter  et  al.,  1990). 
Additional contacts are made to a basic surface region outside of the carboxy-terminal
end of the active site of thrombin required for substrate binding, called the fibrinogen
exosite. The acidic carboxyl terminal residues of hirudin bind this surface patch with salt
bridges,  hydrogen  bonding,  and  hydrophobic  packing  to  assist  with  the  tight-binding 
reaction. The proteins rhodniin and ornithodorin are biological inhibitors of thrombin
and are isolated from the assassin bug Rhodnius prolixus and the tick Ornithodoros moubata, respectively.
The  mechanism  of  thrombin  inhibition  for  these  proteins  is  similar,  even  though  these 
proteins  have  different  folds,  and  the  mechanism  differs  from  the  regulatory  serpin 
antithrombin  III.  Thrombin-inactivating  serpins  bind  through  the  active  site  with  a 
typical serpin mechanism, but these require the cofactor heparin for a 1000-fold increase
in reaction rates (Jordan et al., 1980). Hirudin, rhodniin, and ornithodorin bind partially
through  the  active  site,  but  also  many  contacts  are  made  with  the  fibrinogen  exosite  of 
thrombin,  a region  of  specificity  for  the  substrate  fibrinogen  (van  de  Locht  et  al.,  1995, 
van  de  Locht  et  al.,  1996).  The  protein  triabin,  isolated  from  salivary  extracts  of  the 
triatomine bug Triatoma pallidipennis, has been crystallized with thrombin and only
binds  to  the  fibrinogen  exosite  region  of  thrombin  and  not  into  the  active  site  at  all 
(Fuentes-Prior  et  al.,  1997).  When  triabin  is  bound,  thrombin  is  still  able  to  hydrolyze 
bonds  of  small  peptides  but  not  of  natural  protein  substrates  due  to  the  steric  hindrance  of 
the triabin. All of these parasite thrombin inhibitors have a stronger affinity for
thrombin than the natural serpins alone and an affinity similar to that of the serpin-heparin inhibition of
thrombin.

Cysteine  Proteinases  and  Inhibitors 

This class of endopeptidases consists of proteinases that are critical to
apoptosis, lysosomal protein degradation, and extracellular remodeling. The general
mechanism  of  cysteine  proteinases  is  similar  to  that  for  the  serine  proteinases,  with  the 
exception that the catalytic serine is replaced with a cysteine residue, and the nucleophilic
attack  forms  a thioacyl  intermediate.  The  improper  regulation  of  these  proteins  has  been 
implicated  in  pathologies  ranging  from  rheumatoid  arthritis  to  tumor  malignancy  to 
neurodegenerative and neuromuscular diseases (Menard and El-Amine, 1996).

Regulation  of  these  proteinases  includes  the  inactivation  by  inhibitors.  At  least  three 
families of cysteine proteinase inhibitors have been structurally examined, the stefins, the
cystatins, and the kininogens (Turk and Bode, 1991). The stefins and cystatins are
proteins of about 100 and 120 amino acids, respectively. The kininogens are in two classes of 68-kDa
and 120-kDa proteins, and both classes are composed of three cystatin-like domains.
These inhibitors all tend to have a similar mechanism of extending a wedge into the
proteinase active site to block the S2 and S1 subsites and two loops that interact with the
S1' and S2' subsites (Stubbs et al., 1990). The high degree of sequence similarity
between  the  contact  sites  of  the  three  classes  of  cystatins  suggests  a similar  inhibitor 
mechanism  for  all  three. 

Metalloproteinase  Inhibitors 

Matrix  metalloproteinases  (MMPs)  are  known  to  function  in  the  extracellular 
matrix  for  tissue  remodeling  and  have  been  implicated  in  invasive  processes  of  metastatic 
cancer  cells.  Tissue  inhibitors  of  metalloproteinases  (TIMPs)  are  the  major  post- 
translational  regulators  of  many  MMPs,  and  these  inhibitors  are  critical  for  preventing 
tissue  damage  from  the  MMPs.  Four  forms  of  TIMPs  have  been  isolated  from  human 
extracellular matrix, possessing between 184 and 195 amino acids. These proteins share
42-52%  sequence  identity,  and  twelve  cysteine  residues  are  structurally  conserved  among 
the  TIMPs  (Tuuttila  et  al.,  1998).  TIMPs  target  most  matrix  metalloproteinases  with 
similar  sub-nanomolar  affinities  (Ward  et  al.,  1991,  Henriet  et  al.,  1999).  The  TIMP-1 
and  TIMP-3  proteins  are  glycosylated  while  the  TIMP-2  and  TIMP-4  lack  post- 
translational  modifications  (Henriet  et  al.,  1999). 

Studies of the inhibition of metalloproteinases by TIMPs began before a
structure for either the inhibitors alone or their complexes with proteinases had been
determined. Chemical modifications to TIMP-1 were performed to identify residues that
contribute  to  the  interaction  with  stromelysin  (Williamson  et  al.,  1993).  The 
modifications  to  histidine  residues  by  diethyl  pyrocarbonate  prevented  the  inhibitor 
binding to stromelysin. Site-directed mutations at His7 to Glu, Gln, and Ala reduced the
inhibition  against  matrilysin  by  2-6  fold;  however,  no  other  histidine  mutation 
significantly  compromised  the  inhibitory  properties  of  TIMP-1  (O’Shea  et  al.,  1992). 
Other residues were selectively mutated, and the Gln9Ala mutation of TIMP-1 showed a
3-fold  reduction  in  matrilysin  affinity  while  maintaining  a similar  structural  stability,  by 
fluorescence  unfolding  characterization  (O'Shea  et  al.,  1992).  Crystallographic  studies  of 
complexes  indicated  those  residues  contact  the  target  protease  but  are  not  critical  for 
inhibition (Gomis-Ruth et al., 1997). After the assignment of the six cystine disulfide pairs
was determined for the TIMPs, the carboxy-terminal third of the TIMP-1 gene was
deleted, leaving a truncated gene encoding the first 126 amino acids (Murphy et al.,
1991). This truncated protein was structurally sufficient to function as a
metalloproteinase  inhibitor;  however,  this  truncated  protein  had  a 2-3  fold  lower  affinity 
for  stromelysin  and  gelatinase- 1 than  the  full-length  protein  and  unaltered  affinities 
toward  collagenase  and  matrilysin. 
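
Fold changes in affinity such as those reported for the TIMP-1 variants can be placed on an energy scale with ΔΔG = RT ln(Kd,mutant / Kd,wild-type). The sketch below is illustrative only; the function name and the temperature of 298.15 K are assumptions, and the fold changes are used simply as example inputs rather than as a reanalysis of the cited data.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_weaker: float, temp_kelvin: float = 298.15) -> float:
    """Change in binding free energy (kcal/mol) for a given fold change in affinity.

    fold_weaker is Kd(mutant) / Kd(wild type): values above 1 mean weaker binding
    and give a positive (destabilizing) ddG; values below 1 give a negative ddG.
    """
    if fold_weaker <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temp_kelvin * math.log(fold_weaker)


if __name__ == "__main__":
    for fold in (2.0, 3.0, 6.0):
        print(f"{fold:.0f}-fold weaker  ->  ddG = {ddg_from_fold_change(fold):+.2f} kcal/mol")
```

On this scale a 2- to 6-fold loss of affinity amounts to roughly 0.4 to 1.1 kcal/mol, less than the energy typically attributed to a single hydrogen bond, which is consistent with the view that the mutated residues contact the proteinase without being critical for inhibition.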

TIMPs  have  been  described  as  possessing  a modified  canonical  mechanism  for  the 
inactivation  of  target  proteinases,  and  at  least  two  complexes  between  TIMPs  and 
proteinases have been crystallized to study this mechanism. A model of
non-glycosylated TIMP-1 with the catalytic domain of stromelysin has been determined
(Gomis-Ruth et al., 1997), as has the complex between TIMP-2 and the catalytic domain
of membrane MMP-1 (Fernandez-Catalan et al., 1998). The mechanism of this
interaction  appears  to  involve  multiple  contact  sites,  but  it  does  not  directly  bind  the 
catalytic  amino  acid  residues.  The  first  four  amino  acids  bind  within  the  active  site 
subsites S1-S1'-S2'-S3' similar to an extended substrate. The α-amino group of Cys1 also
appears to disrupt the coordination of the catalytically "activated" water molecule to the Zn
of the active stromelysin. The Ser68-Val69 residues bind into the S2-S3 subsites in
the  non-prime  side  of  the  active  site.  Additional  contacts  from  other  loops  were  shown  to 
aid  with  the  physical  contact  between  the  inhibitor  and  the  proteinase.  Major  contact 
sites  between  the  TIMP  and  MMP  are  due  to  residues  in  the  TIMP  that  are  conserved 
among  the  different  forms  of  TIMPs  and  are  likely  critical  for  the  same  mechanism 
among  this  class  of  inhibitors. 

Aspartic  Proteinases  and  Inhibitors 

Aspartic  proteinases  [EC  3.4.23]  comprise  a unique  class  of  proteolytic  enzymes 
that  are  characterized  by  having  acidic  isoelectric  points  and  maximal  activity  in  acidic 
environments.  These  proteins  predominantly  target  substrates  containing  hydrophobic 
residues in the P1 and P1' positions. These proteinases are also typically inhibited by the general
aspartic proteinase inhibitor, pepstatin. Pepstatin is a statine-containing peptide derived
from  the  Streptomyces  bacteria  (Umezawa  et  al.,  1970).  Aspartic  proteinases  also  have 
two  conserved  Asp-Thr/Ser-Gly  sequences,  one  in  each  domain.  The  hydrolysis  of 
peptide  bonds  is  catalyzed  by  the  two  aspartate  residues  in  the  active  site,  Asp32  and 
Asp215,  numbered  according  to  pepsin.  One  of  the  residue  side  chain  carboxylate  groups 
is  deprotonated  at  the  favored  acidic  conditions  of  these  enzymes  and  acts  as  a proton 
acceptor from a water molecule in the active site. Simultaneously, the water performs a
nucleophilic attack on the carbonyl carbon of the P1 position, disrupting the peptide bond
to the P1' residue (Davies, 1990).
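
The two conserved Asp-Thr/Ser-Gly sequences described above can be located in a one-letter amino acid sequence with a simple pattern search. The sketch below is a minimal illustration; the function name is hypothetical and the example sequence is a made-up toy string, not the sequence of pepsin or of any real proteinase.

```python
import re

# Asp, followed by Thr or Ser, followed by Gly, in one-letter code.
CATALYTIC_MOTIF = re.compile(r"D[TS]G")


def find_catalytic_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asp, matched tripeptide) for each D-T/S-G motif."""
    normalized = sequence.upper()
    return [(match.start() + 1, match.group())
            for match in CATALYTIC_MOTIF.finditer(normalized)]


if __name__ == "__main__":
    # Toy sequence invented for illustration; it contains one DTG and one DSG.
    toy_sequence = "MKAIVDTGSSNLWVPAAAGLIDSGTSLLTG"
    for position, motif in find_catalytic_motifs(toy_sequence):
        print(f"{motif} with Asp at position {position}")
```

Positions returned this way are counted from the start of whatever sequence is supplied, so they match the pepsin numbering (Asp32 and Asp215) only when the mature pepsin sequence itself is used as input.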

These  enzymes  share  a common  tertiary  structure  of  two  domains  principally 
composed of β-sheets. Between the two domains lies the active site of the enzyme. The
flap, a β-hairpin, covers a portion of the active site directly over the S1-S1' subsites. The
catalysis  of  substrate  depends  on  the  properly  folded  conformation  of  the  domains  and 
the  arrangement  of  the  active  site  residues. 

Though  the  mechanisms  of  aspartic  proteinases  appear  to  be  similar  based  on 
primary  and  tertiary  structural  homologies,  great  differences  in  the  substrate  specificity 
exist  among  the  different  aspartic  proteinases.  The  enzyme  renin,  a protein  secreted  into 
blood  that  is  critical  for  the  maintenance  of  blood  pressure,  has  the  specific  substrate 
angiotensinogen.  An  enzyme  secreted  into  the  stomach,  pepsin,  is  rather  nonspecific  for 
the  digestion  of  proteins.  While  these  proteins  are  topologically  similar,  differences  in 
the  amino  acid  compositions  of  the  active  site  contribute  to  differences  in  catalytic  rates 
and  preference  of  substrates. 

All  eukaryotic  aspartic  proteinases  are  expressed  in  the  form  of  inactive  zymogen 
proteins. The amino-terminal regions of these zymogens include signal sequences for secretion (e.g.
pepsin)  or  organelle  targeting  (e.g.  cathepsin  D)  and  a propeptide  that  behaves  as  an 
intramolecular  inhibitor.  At  a neutral  pH,  the  central  portion  of  the  prosegments  tends  to 
be  associated  with  the  active  site  of  the  enzymes  (Khan  and  James,  1998).  The  basic 
propeptide  forms  additional  electrostatic  interactions  with  the  acidic  protein  surface. 
Electrostatic  interactions  are  neutralized  with  the  shift  in  pH  from  the  near  neutral 
endoplasmic reticulum through the Golgi to the more acidic compartments of the
lysosomes or the secretions into the stomach. The pH change permits complete
autocatalytic processing for the gastric enzymes (Khan et al., 1997), but other enzymes
require assisted catalysis for zymogen activation, as for prorenin processing (Koelsch et
al., 1994). The vacuolar proteinase plasmepsin II from Plasmodium falciparum, a parasite
that  causes  malaria,  does  not  appear  to  be  a zymogen  that  is  inactivated  by  occluding  the 
active  site.  The  124  amino  acid-long  prosegment  wraps  around  the  carboxyl  terminal 
domain  of  the  proteinase,  and  the  catalytic  aspartic  acid  residues  electrostatically  repel 
one  another  at  the  neutral  pH.  The  carboxyl  terminal  domain  is  separated  from  the  amino 
terminal  domain  by  14°  compared  to  the  active  conformation.  In  this  manner,  the 
catalytic  residues  are  too  far  separated  to  be  active  (Bernstein  et  al.,  1999). 

Few  natural  inhibitors  of  aspartic  proteinases  are  known  to  exist.  Aside  from  the 
prokaryote  inhibitor  pepstatin  and  the  prosegments  of  these  endopeptidases,  only  four 
other  classes  of  aspartic  proteinase  inhibitors  have  been  discovered.  Studies  have  shown 
that  a 68-amino  acid  protein  from  yeast  is  a selective  inhibitor  of  yeast  aspartic  proteinase 
A (Dreyer  et  al.,  1985).  Two  proteins  from  plants,  potato  and  tomato,  inhibit  cathepsin  D 
and  are  induced  to  expression  upon  physical  wounding  (Mares  et  al.,  1989,  Werner  et  al., 
1993).  An  unrelated  protein  from  squash  inhibits  pepsin  and  fungal  aspartic  proteinases 
(Christeller, 1998). Proteins from nematodes, PI-3 from Ascaris and the Onchocerca
volvulus Ov33 protein (Girdwood et al., 1998), have been shown to inhibit aspartic
proteinases. A growing series of proteins and genes from filarial parasitic nematodes
(Acanthocheilonema viteae, Brugia malayi, and Dirofilaria immitis) and the soil
nematode Caenorhabditis elegans have been discovered that are related to the PI-3 gene
product (Figure 1-2) (Willenbucher et al., 1993, Dissanayake et al., 1993, Hong et al.,
1996,  Martzen  et  al.,  1990,  Wilson  et  al.,  1994). 

Ascaris  Biology 

A variety  of  parasites  have  been  shown  to  express  inhibitors  of  different  host 
proteinases.  As  described  earlier,  many  characterized  inhibitors  have  been  isolated  from 
blood-feeding  parasites.  Proteinase  inhibitors  have  also  been  discovered 
from  internal  parasitic  nematodes.  Blood-borne  and  intestinal  nematodes  that  infect 
mammals  are  known  to  produce  proteinase  inhibitors.  Filarial  nematodes  survive  in  the 
blood  stream  for  the  majority  of  their  lives.  Intestinal  nematodes  may  migrate  through 
various  systems  of  the  body  during  the  larval  stages  but  survive  as  adults  in  the  intestines 
of the host organism. This second class of nematodes includes Ascaris
lumbricoides and Ascaris suum, which predominantly infect humans and pigs,
respectively. The two species of ascarids are quite similar, and they are believed to share
similar life cycles in the different host species (Murrell et al., 1997). These roundworms
can  survive  in  the  opposite  host;  therefore,  the  host-parasite  relationship  is  not  entirely 
species-specific  (Anderson,  1995). 

Predictions  of  infection  rates  project  the  occurrence  of  more  than  one  billion  cases 
of  ascariasis  in  humans  (de  Silva  et  al.,  1997).  The  vast  majority  of  these  occur  in 
tropical  regions  where  poor  sanitation  and  overpopulation  are  prevalent.  Many 
individuals  infested  with  the  roundworms  have  multiple  worms  in  their  intestines,  and  a 
large  portion  is  likely  to  be  infested  with  one  or  more  other  types  of  parasites 
(Kightlinger  et  al.,  1995).  Less  than  one  percent  of  cases  result  in  pathologies  ranging 
from  an  overburden  of  worms  to  adult  worms  that  migrate  to  other  organs  like  the 
pancreas, liver, or gall bladder (de Silva et al., 1997). However, Ascaris infections
typically  affect  children,  and  the  presence  of  the  worms  can  contribute  to  malnutrition 
and  retard  the  physical  development  of  children  who  are  hosts  to  these  nematodes. 

The  life  cycle  of  the  nematodes  follows  a number  of  stages  from  the  reproductive 
adults  living  in  the  host  small  intestinal  tract  that  produce  viable  eggs  to  the  growth  of  the 
egg  through  larval  stages  and  into  mature  worms  again.  The  adult  worms  reproduce 
sexually,  and  the  female  releases  the  eggs  that  are  deposited  in  the  feces  of  the  host. 

Eggs  can  remain  viable  for  many  years  while  in  the  soil.  Hosts  are  infected  from 
ingesting  Ascaris  eggs.  Larvae  shed  the  egg  casings  after  passing  through  the  stomach 
and  into  the  small  intestine.  Second  stage  larvae  burrow  into  the  walls  of  the  small  and 
large  intestines  and  enter  the  bloodstream.  These  worms  pass  through  the  liver  and  to  the 
lungs  via  the  bloodstream.  Molting  into  third  stage  larvae  in  the  lungs,  the  worms 
produce  irritants  that  induce  the  host  to  cough.  The  act  of  coughing  brings  the  larvae  up 
from  the  bronchi  of  the  lungs  and  into  the  throat,  and  the  worms  are  then  swallowed. 

This  second  passage  through  the  gastrointestinal  tract  concludes  in  the  small  intestines 
again,  where  the  worms  reach  reproductive  maturity  and  remain  throughout  their  lives. 

Ascaris  inhibitors.  These  roundworms  are  known  to  produce  a number  of 
proteinase  inhibitors  believed  to  assist  with  the  organisms'  survival  in  the  digestive 
enzyme-rich  intestinal  tract.  These  proteinaceous  inhibitors  show  specific  affinity  toward 
carboxypeptidase  A and  B,  trypsin,  chymotrypsin  and  elastase,  and  pepsin  (Peanasky  et 
al.,  1987).  These  proteins  are  produced  in  large  numbers  and  are  present  in  the  worm 
digestive  tract,  perhaps  to  protect  the  worms  from  the  digestion  of  cell  surface  proteins  or 
to  regulate  the  substrate  cleavage  of  proteins  for  the  worm’s  nutritional  requirements. 
The protease inhibitors are not expressed in equivalent amounts by the worm. While the
chymotrypsin  and  trypsin  inhibitors  are  expressed  in  large  numbers,  the  pepsin  inhibitor 
is  produced  at  less  than  1%  the  level  of  the  former  inhibitors  (Martzen  et  al.,  1990). 

These  different  inhibitors  have  been  studied  in  situ  and  in  vitro.  The 
chymotrypsin/elastase  and  trypsin  inhibitors  have  been  isolated  to  the  intestinal  cells  and 
cells  lining  the  cuticle  and  genital  tracts  of  the  worms  by  antibody  localization  (Martzen 
et al., 1985). These inhibitors have also been detected by immunofluorescence
in the different stages of the larvae as well as in the adult (Martzen et al., 1986). The
inhibitors  had  been  collected  from  the  worms  obtained  either  from  culture  in  the 
laboratory  or  from  an  abattoir.  The  greater  expression  of  the  carboxypeptidase,  trypsin, 
and  chymotrypsin/elastase  inhibitors  made  the  process  of  identification  and  purification 
of  the  native  protein  simpler  than  for  the  pepsin  inhibitor  (Martzen  et  al.,  1990).  The 
chymotrypsin/elastase-I  inhibitor  has  been  crystallized  and  a three-dimensional  model 
has  been  constructed  from  the  data  (Huang  et  al.,  1994). 

The  pepsin  inhibitor  was  purified  from  the  nematodes  and  shown  to  have  a 
preferential affinity for pepsin and gastricsin (Abu-Erreish and Peanasky, 1974b). The
initial  purification  included  the  four  chromatographic  peaks  from  a DEAE  Sephadex 
column,  and  all  peak  fractions  contained  functional  pepsin  inhibitors.  The  most  abundant 
of  these  inhibitors  was  the  third  peak  of  the  protein,  hence  the  name  pepsin  inhibitor-3, 
PI-3. The protein could be digested by trypsin and chymotrypsin, suggesting that the
protein contained some flexible structural elements. PI-3 also had weaker affinities for
cathepsin E and no detectable inhibition against fungal enzymes and cathepsin D
(Keilova and Tomasek, 1972, Jupp et al., 1988, Valler et al., 1985). The amino acid
sequence of the protein was determined for the 149 amino acid residues by selective
chemical and enzymatic digestion of PI-3 followed by sequencing the different
fragments of the protein (Martzen et al., 1990). The protein was also shown to contain
six cysteine residues forming three disulfide bonds, as seen in Figure 1-3 (Martzen et al.,
1990).

Figure 1-3. Primary Sequence of PI-3 Showing the N-Terminal Leader Sequence

The first twenty amino acids are a signal sequence and were discovered by Kageyama
(1998). The active protein discovered by Peanasky's group begins with Gln1, shown
in bold. Martzen et al. (1990) determined the disulfide pairing for the native protein,
and the cysteine residues are numbered for reference.

Analysis  of  chemical  modifications  to  a recombinant  PI-3  suggested  that  lysine 
residues  could  be  derivatized  by  dansyl  chloride  or  phenylisothiocyanate  with  a reduction 
in  the  inhibition  of  pepsin  (Kageyama,  1998).  Analysis  of  specific  digests  of  the 
modified  inhibitor  indicated  that  the  carboxyl  terminal  half  of  PI-3  had  more  lysine 
residues  modified  by  the  chemical  treatment,  suggesting  that  more  of  these  residues  were 
surface exposed and potentially involved in specific binding of pepsin. Additionally,
Kageyama  showed  that  the  gene  encoding  PI-3  includes  a 20-amino  acid  amino-terminal 
segment  that  appears  to  be  a secretion  signal  (Figure  1-3). 

The  work  described  here  began  with  the  construction  of  a recombinant  protein 
system  from  which  the  pepsin  inhibitor-3  could  be  purified  to  homogeneity.  This 
recombinant  protein  has  been  used  to  address  the  questions  of  how  the  protein  forms  a 
compact tertiary structure, what structures contribute to the specificity of inhibition, and
what the properties of the mechanism of inhibition are. The protein has been
characterized as to the preferences of the inhibitor toward different aspartic proteinases.
Physical characteristics of the protein were explored to better understand
the stability of the folded state of the protein. Mutations to amino acid residues of the
protein were developed to attempt to identify residues that were critical to the mechanism
of inhibition.


CHAPTER  2 

EXPERIMENTAL  PROCEDURES  FOR  THE  ANALYSIS 
OF  THE  RECOMBINANT  PROTEINASE  INHIBITOR-3,  RPI-3 

Introduction 

Experiments  with  the  PI-3  proteinase  inhibitor  required  the  development  of  a 
recombinant  expression  system  in  which  the  protein  could  be  purified  to  homogeneity. 
Analysis  of  the  protein  was  aimed  at  understanding  the  physical  mechanisms  behind  the 
functional  selectivity  and  the  structural  stability  of  the  inhibitor.  This  chapter  describes  i) 
the sub-cloning of the gene into an expression vector, ii) the purification of rPI-3,
iii) site-directed mutagenesis experiments on the gene, iv) analysis of the mutant recombinant
proteins,  v)  kinetic  assays  for  the  different  modes  of  inhibition,  and  vi)  biophysical 
analyses  of  the  protein  by  fluorescence  spectrometry,  mass  spectrometry,  and  circular 
dichroism  spectrometry. 

Materials 

The  Ascaris  cDNA  was  a gift  from  Tim  Nielsen  (Case  Western  Reserve  University, 
Cleveland). The isolation and incorporation of the gene into the pGEM-T, pHIL-S1, and
pET-22b(+)  (Figure  2-1)  vectors  was  completed  by  Bill  Farmerie,  Rich  Rogers,  and 
Meredith  Barman  (University  of  Florida,  Gainesville).  Oligonucleotides  were  obtained 
either  from  the  University  of  Florida  Interdisciplinary  Center  for  Biotechnology  Research 
(UF-ICBR)  DNA  Synthesis  Core  Facility  or  from  Gibco-BRL  (Gaithersburg).  Molecular 
biology enzymes were purchased from New England Biolabs (Beverly) or Promega
(Madison)  and  used  according  to  the  manufacturers’  suggestions.  Sequencing 
radiochemicals  were  purchased  from  Amersham  Pharmacia  (Piscataway).  The  fragment 
peptides,  PI-3a,  PI-3aI,  and  PI-3a-amide,  were  synthesized  by  Jens  Petersen  (University 
of  Alberta,  Edmonton).  Chromogenic  substrates  were  synthesized  using  solid  phase 
methods with an Applied Biosystems Model 432A automated peptide synthesizer by the
UF-ICBR  Protein  Chemistry  Core.  Amino  acid  analyses  were  performed  by  the  UF- 
ICBR  Protein  Chemistry  Core  on  a Beckman  System  6300  High  Performance  Analyzer 
following acid hydrolysis. Amino-terminal sequencing was completed by the UF-ICBR
Protein  Chemistry  Core  using  Applied  Biosystems  470  or  473A  Protein  Sequencers. 
Isoelectric  focusing  electrophoresis  was  performed  on  a Fast-Gel  apparatus  by  the  UF- 
ICBR  Protein  Chemistry  Core.  Proteolytic  enzymes  used  in  the  study  came  from  a 
variety  of  sources.  Pepsin  and  human  cathepsin  D were  purchased  from  Sigma  (St. 
Louis).  Human  cathepsin  E was  a gift  from  John  Kay  (College  of  Cardiff,  Wales). 
Plasmepsin  II  was  a recombinant  protein  purified  by  Jennifer  Westling  (University  of 
Florida).  Rhizopuspepsin  and  PpnN-rhzC  chimera  proteins  were  expressed  and  purified 
by  Deepa  Bhatt  (University  of  Florida).  All  other  chemicals  were  obtained  from  Fisher 
(Pittsburgh)  unless  stated  otherwise  below. 

Methods 

Sub-cloning  the  Pi-3  Gene 

The  pi-3  gene  isolated  from  Ascaris  suum  cDNA  had  been  cloned  previously  into 
the vector pGEM-T and subsequently into the expression vectors pHIL-S1 and
pET-22b(+) (Figure 2-1). The gene was sub-cloned into another bacterial expression
vector system, pET-3d from Novagen, as described below (Figure 2-2).

The pET-22b(+) DNA was amplified using two primers designed to contain a
BspHI restriction site (TCATGA) at the 5' (5'-CTC GCT GCC CAG CCG GTC ATG ACT
CAG TTC CTG TTC TCC ATG-3') and 3' (5'-TTG CTG ATA CAA CTT TCA TGA
TTA TTA TTG TAG AGT GCA AAA-3') ends of the pi-3 gene. The reactions were
performed on a Gene Machine II Thermocycler (USA Scientific Plastics) with
1 µg of plasmid DNA, 0.4 mM dNTPs, 2 µM of the primers, 5 µl of
the 10x reaction buffer, 1 µl of Taq DNA polymerase (5 U), and dH2O up to 50 µl. The
gene  was  amplified  during  25  cycles  of  the  following  temperature  profile:  94°C  for  30 
seconds,  62°C  for  50  seconds,  and  72°C  for  40  seconds.  After  the  final  round,  the  72°C 
extension  temperature  was  maintained  for  an  additional  7 minutes  to  ensure  complete 
elongation  of  the  double-stranded  DNA. 
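
As a quick sanity check on the amplification program described above, the following Python sketch sums the hold times of the stated profile and confirms that the 3' primer carries a BspHI recognition sequence. It is illustrative only: the function names are hypothetical, ramp times between temperatures are ignored (an assumption), and the primer string is simply the 3' primer from the text with the spaces removed.

```python
# Cycling profile as stated in the text: (temperature in deg C, hold time in seconds).
CYCLE_STEPS = [(94, 30), (62, 50), (72, 40)]
N_CYCLES = 25
FINAL_EXTENSION_S = 7 * 60  # additional 72 deg C hold after the last cycle

# 3' primer copied from the text with spaces removed.
PRIMER_3 = "TTGCTGATACAACTTTCATGATTATTATTGTAGAGTGCAAAA"
BSPHI_SITE = "TCATGA"  # BspHI recognition sequence (cut as T^CATGA)


def hold_time_seconds(steps, n_cycles, final_extension_s):
    """Total programmed hold time; ramping between temperatures is ignored."""
    per_cycle = sum(seconds for _, seconds in steps)
    return n_cycles * per_cycle + final_extension_s


def find_site(primer, site):
    """Return the 0-based index of the first occurrence of site, or -1 if absent."""
    return primer.find(site)


total_s = hold_time_seconds(CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION_S)
site_index = find_site(PRIMER_3, BSPHI_SITE)

print(f"Programmed hold time: {total_s / 60:.0f} min")
print(f"BspHI site found at primer index: {site_index}")
```

The same two helpers could be reused for the mutagenesis cycling profiles later in this chapter by changing the step list and cycle count.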

Digestion and ligation. A preparation of the amplified DNA was digested with the
enzyme BspHI in a reaction containing 33 µl of the DNA, 5 µl of NEB Buffer 4, and 2 µl
of BspHI (10 U). The fragments were separated by electrophoresis on a 1.2% low-melt
agarose gel (Nu-Sieve) in TBE buffer (0.05 M Tris, 0.05 M borate, 1 mM EDTA).
The fragment band corresponding to the size of the BspHI-digested region was excised
from the gel and purified (Bio-Rad DNA Purification Kit). The pET-3d vector was
digested with the endonuclease NcoI using 4 µl of pET-3d, 2 µl of NEB Buffer 4, 2 µl of
NcoI (10 U), and sterile water to a final volume of 20 µl. The digested plasmid was
treated with calf intestinal phosphatase (CIP) at 50°C for one hour to prevent
self-ligation of the plasmid DNA. The reactions involved the addition of 1 µl of
the phosphatase (10 U) to the digest tube. CIP was inactivated by adding 0.4 µl of 0.5 M
EDTA and heating the sample to 75°C for 20 minutes. The single-stranded overhang
generated by NcoI is complementary to the single-stranded overhang generated by
BspHI digestion. The digested insert gene and plasmid DNA (15:1 volume ratio) were
ligated together in a 40 µl overnight reaction at 16°C including 4 µl of T4 ligase buffer
containing 1 mM dATP and 1 µl T4 DNA ligase (0.15 Weiss Units). Ligase was
inactivated at 65°C in a water bath for 10 minutes, and 1 µl of the ligation reaction was
used to transform HMS174 E. coli competent cells (Novagen).

Figure 2-1. Vector Map of pET-22b(+)/pi3

The vector contains the β-lactamase gene conferring ampicillin resistance (Amp) to E. coli hosts in
which the plasmid has been transformed. Expression of the gene pi3 is controlled by the
lacI gene and the T7 promoter (P T7) elements. The origin of replication (ori) comes
from the parent plasmid, pBR322. The BamHI and XbaI restriction sites are labeled on the map.

Figure 2-2. Vector Map of pET-3d/pi3

The plasmid contains the pBR322 origin of replication (ori), the β-lactamase gene for ampicillin
resistance (Amp), and the T7 promoter (P T7) for efficient transcription of the pi3 gene.
The NcoI restriction site is shown to indicate its position relative to the XbaI site. The
NcoI site was modified in the sub-cloning procedure.
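
The compatibility of the NcoI and BspHI ends used in the ligation step can be illustrated with a short Python sketch. This is a string-level model under stated assumptions, not a description of the author's procedure: the helper names are hypothetical, and the recognition sequences and cut positions (C^CATGG for NcoI, T^CATGA for BspHI) are the standard published ones rather than values taken from this chapter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(site, cut_index):
    """Single-stranded 5' overhang left by a palindromic enzyme.

    Assumes the enzyme cuts cut_index bases into the site on each strand,
    before the center of the site (true for NcoI and BspHI).
    """
    return site[cut_index:len(site) - cut_index]


def ends_are_compatible(overhang_a, overhang_b):
    """Two 5' overhangs anneal if one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)


NCOI_SITE, NCOI_CUT = "CCATGG", 1    # C^CATGG
BSPHI_SITE, BSPHI_CUT = "TCATGA", 1  # T^CATGA

ncoi_overhang = five_prime_overhang(NCOI_SITE, NCOI_CUT)
bsphi_overhang = five_prime_overhang(BSPHI_SITE, BSPHI_CUT)
compatible = ends_are_compatible(ncoi_overhang, bsphi_overhang)

# Junction formed when the vector's NcoI-cut end is joined to the insert's BspHI-cut end.
junction = NCOI_SITE[:NCOI_CUT] + BSPHI_SITE[BSPHI_CUT:]
site_regenerated = junction in (NCOI_SITE, BSPHI_SITE)

print(ncoi_overhang, bsphi_overhang, compatible)
print(junction, site_regenerated)
```

In this model the hybrid junction matches neither recognition sequence, which is consistent with the statement that the NcoI site was modified during sub-cloning.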

Transformation. The ligated product was mixed with 20 µl of the cell stock and
kept on ice for 30 minutes. Reaction tubes were then placed into a 42°C water bath for
40 seconds and immediately transferred to ice. After two minutes on ice,
100 µl of SOC media was added to the sample. Samples were then placed into an
incubator rotating at 250 rpm at 37°C for one hour. Following the incubation, 40-80 µl
of the transformants were spread onto LB-ampicillin (50 µg/ml) plates.
Plates were stored overnight in a 37°C incubator to permit the growth
of bacterial colonies.

Plasmid  preparation  and  analysis.  Colonies  from  the  plate  were  randomly  chosen  to 
inoculate 3 ml of LB media (one liter prepared with 10 gm bacto-tryptone, 5 gm yeast
extract, and 10 gm NaCl, brought to pH 7.0) with 50 µg/ml ampicillin. The culture tubes
were  shaken  for  14-16  hours  at  300  rpm  and  37°C.  Plasmids  were  purified  from  the  cells 
following  alkaline  lysis  of  cells  in  the  presence  of  RNase  and  acid  precipitation  of 
cellular  components  (Promega  Wizard  Mini-Prep  or  Qiagen  Spin  Mini-Prep  Kits).  Small 
DNA molecules were retained in the supernatant and purified over a proprietary DNA-binding
column. The DNA was washed with 55% or 75% ethanol, dried, and eluted from the
columns with water. Purified DNA plasmids were analyzed by 1% high-melt agarose gel
electrophoresis.  The  plasmids  were  also  analyzed  for  correct  incorporation  of  the  gene 
by digestion with two enzymes whose sites are unique in the plasmid, ClaI and [...].
These products were also visualized by agarose gel electrophoresis. Following these
analyses,  the  DNA  plasmids  were  transformed  into  another  E.  coli  cell  line,  BL21(DE3) 
(Novagen),  for  protein  expression.  The  transformation  procedure  was  performed  as 
before.  Resulting  colonies  on  the  plates  were  used  to  inoculate  LB-ampicillin  media. 
After growing overnight, a sample of 800 µl of cells containing the pET-3d/pi3 vector
was mixed with 200 µl sterile glycerol and stored at -80°C for use as a frozen stock for
DNA  sequencing  and  protein  expression. 

Mutagenesis 

Development of site-directed mutants of the recombinant PI-3 was accomplished
using either of two methods. The first four mutations to the gene were generated by the
PCR-based overlap extension mutagenesis method (Ho, 1989). This method requires
four oligonucleotide primers to be used in pairs as described below and depicted in
Figure 2-3. Two primers (the internal primers) are generated with a short sequence of
complementary overlap and the mutation encoded. The other two oligonucleotides
(external) are synthesized complementary to sequences beyond the 5' and 3' ends of the
gene. The external primers were also designed to include the unique restriction
sites, XbaI and BamHI, respectively, to allow for the digestion of the amplified DNA
fragment for religation into the vector. These oligonucleotide primers are listed in Table
2-1. The reactions were performed with 1 µl of plasmid DNA (1-3 µg), 0.4 mM dNTPs,
1 µl of the first and second primers (2 µM), 5 µl of the 10x reaction buffer, supplemented
with 0-2 mM MgCl2, 1 µl of Taq DNA polymerase (5 U), and dH2O up to 50 µl to
amplify copies of the 5' end of the gene segment. The same components were used to
generate the 3' segment; however, the third and fourth primers were used instead of the
first two.

Figure 2-3. Site-Directed Mutagenesis Scheme 1: PCR-Directed Mutagenesis

Multiple rounds of the polymerase chain reaction were performed. In step 1,
two separate sets of reactions amplified the plasmid sequence between primers
A and B (1a) and primers C and D (1b). Purified fragments from reactions 1a
and 1b were combined in another reaction using the external primers A and D,
which anneal to only half of the possible combinations of overlapping fragments
(2). Final products should contain the mutation (♦) on both strands (3).

Table 2-1. Oligonucleotide Primers Generated for Mutations at Residues K72 and K85

Reaction(a)   Oligonucleotide Primer Sequence (5' - 3')        Strand(b)
All           CTA TAG GGA GAC CAC AAC GGT TTC C                Primer A
All           GCT TTG TTA GCA GCC GGA TCC GAC C                Primer D
K85L, K85E    ATT GCC CAA AAC CGA GCA GCC AA                   Primer B
K72L, K72E    CAT ATC TTG TGG ACC ACA GAA TGG                  Primer B
K72L          CAC AAG ATA TGC TGA TGT TTA AAT TCG TTG          Primer C
K72E          CAC AAG ATA TGG AGA TGT TTA AAT TCG TTG          Primer C
K85E          CCA CAA GAT ATG GAG ATG TTT AAT TTC G            Primer C
K85L          CCA CAA GAT ATG CTG ATG TTT AAT TTC G            Primer C

(a) Primer reaction refers to those mutagenesis PCR reactions for which the primer was
used. (b) Primer strand refers to the position of the primer according to the scheme in
Figure 2-3. Underlined codon is the mutation site for the C primers.

Reactions were facilitated by the Gene Machine II Thermocycler and began by
ramping the temperature to 94°C for 40 seconds, at which the DNA was denatured. The
temperature was dropped to 60-62°C for 50 seconds for annealing primers, and the
temperature was brought to 72°C for 40 seconds to elongate the DNA polymers. The 5'
and 3' fragments were generated through 25 cycles of amplification. DNA samples were
separated on a 1.3% low-melting agarose gel, and bands of approximately 280 bp were
excised  and  purified  with  a DNA  binding  resin  (Bio-Rad  Prep-a-Gene  Purification  Kit) 
and  subsequent  ethanol  washes  and  a water  elution.  A third  round  of  PCR  amplification 
was  performed  using  the  external  primers  and  the  two  overlapping  segments  of  double- 
stranded  DNA  as  the  template  for  polymerization.  Equivalent  amounts  of  the  two  DNA 
fragments  were  mixed  with  the  two  external  primers,  while  most  other  conditions  were 
the same. While Taq polymerase was used in the first two sets of PCR to generate the
mutation, Vent polymerase was used in this final set of PCR due to its greater accuracy of
nucleotide  incorporation.  The  amplified  DNA  fragments  were  purified  via  low-melting 
agarose  gel  electrophoresis,  and  bands  were  cleaned  from  the  gel  as  before.  This  DNA 
was digested with the appropriate enzymes, XbaI and BamHI. The plasmid pET-3d was
also digested with these restriction enzymes. These were purified according to the
Bio-Rad Prep-A-Gene protocol. Plasmid DNA was treated with calf intestinal phosphatase
in a one-hour reaction at 50°C using the CIP buffer and 1 µl CIP (10 U). Phosphatase
was inactivated by addition of 0.05 M EDTA and heating the sample to 75°C for 20 minutes.
A ligation  of  the  mutated  gene  with  the  plasmid  was  accomplished  with  an  overnight 
reaction at 16°C using a 3:1 ratio of insert to plasmid, 1 mM dATP, 2 µl of T4 DNA
ligase buffer, and 1 µl of T4 DNA ligase (0.15 Weiss Units). As before, ligase was
inactivated  before  plasmids  were  transformed.  These  products  were  transformed  into  an 
E.  coli  host,  HMS174,  and  plated  onto  LB-ampicillin  plates.  These  genes  were 
sequenced, and identified clones were retransformed into an expression cell line,
BL21(DE3).
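
The logic of the overlap extension method described above can be sketched at the level of sequence strings. The Python example below is a simplified model, not the procedure used in this work: the template is a made-up toy sequence, the function and parameter names are hypothetical, and primer annealing, strand orientation, and polymerase behavior are not modeled. It only shows why two first-round products that share a mutated overlap can be fused into one full-length mutated product.

```python
def overlap_extension(template, codon_start, new_codon, flank=9):
    """Model two first-round fragments and their fusion in the final round.

    template    : sense-strand sequence spanning external primers A to D
    codon_start : 0-based index of the codon to replace
    new_codon   : replacement codon carried by internal primers B and C
    flank       : bases on each side of the codon covered by both internal primers
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three bases long")
    overlap_start = codon_start - flank
    overlap_end = codon_start + 3 + flank
    if overlap_start < 0 or overlap_end > len(template):
        raise ValueError("codon is too close to the end of the template")

    mutated = template[:codon_start] + new_codon + template[codon_start + 3:]
    frag_5 = mutated[:overlap_end]     # product of primers A and B
    frag_3 = mutated[overlap_start:]   # product of primers C and D
    overlap = frag_5[overlap_start:]   # region shared by both fragments

    # The fragments can prime each other only because they share this overlap.
    if not frag_3.startswith(overlap):
        raise AssertionError("fragments do not share the expected overlap")
    fused = frag_5 + frag_3[len(overlap):]
    return frag_5, frag_3, fused


# Hypothetical 42-base toy template with an AAA (Lys) codon at index 21.
TOY_TEMPLATE = "ATGCAGTTCCTGTTCTCCATGAAAGGTACCGATATCGCTTAA"

frag_5, frag_3, fused = overlap_extension(TOY_TEMPLATE, 21, "CTG")
print(len(frag_5), len(frag_3), len(fused))
print(fused[21:24])
```

In this model the fused product has the same length as the template and differs from it only at the replaced codon.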

For the second method of site-directed mutagenesis, the Stratagene Quik-Change
Kit was used, and the method is depicted in Figure 2-4. Generating the mutations
required two primers made complementary to each other, thus able to anneal to the two
strands of a plasmid DNA at the same site (Table 2-2). The mutated bases are
synthesized on both of the oligonucleotide primers. Reactions were performed using 1 µl
of plasmid DNA, 5 µl of Pfu polymerase buffer, 1 µl of dNTPs, 1.25 µl of each of the
two primers, and 1 µl of the Pfu polymerase (2.5 U). Reactions were performed on a
Biometra Uno II thermocycler. The temperature was initially brought to 95°C for 30
seconds. Temperatures were then cycled 15 times from 95°C for 30 seconds, to 55°C for
1 minute, and to 68°C for 10 minutes. After the cycles ended, the temperature returned to
4°C, and samples were removed to ice. An addition of 1 µl of the restriction nuclease
DpnI (10 U) was made to the reactants. The nuclease reactions were continued for an
hour at 37°C. Samples were then transformed into the XL1 Supercompetent E. coli cells.
Transformations were performed by the addition of 1 µl of the reactions to 10-20 µl of
thawed cell suspension. After remaining on ice for 30 minutes, the samples were heat
shocked in a 42°C water bath for 45 seconds. The samples were immediately returned to
the ice for two minutes before adding 60 µl of SOC media to the tubes. These were
incubated in a shaker rotating at 300 rpm for one hour at 37°C. The samples were
removed from the shaker, pulse spun in a microcentrifuge, and one-half to two-thirds of
the media was spread onto an LB-ampicillin plate. Plates were incubated overnight at
37°C.

Figure 2-4. Site-Directed Mutagenesis Scheme 2: Stratagene Quik-Change Method

1. Double-stranded plasmid DNA is denatured, and two primers, complementary in
sequence and containing the mutation (♦), are annealed to opposite strands of the
template. 2. Second strand synthesis is completed by Pfu polymerase, resulting in hybrid
mutated plasmids. 3. A second round of PCR, as in steps 1 and 2 above, yields
double-strand mutated DNA (shown) and hybrid plasmid. Additional rounds of PCR
exponentially increase the amount of double-strand mutated DNA. Treatment with DpnI
should digest any methylated DNA.

Table 2-2. Oligonucleotide Primers for Mutagenesis by the Quik-Change Method

Reaction   Oligonucleotide Primer Sequence (5' - 3')               Strand(a)
K110L      CGT TCA GAG AAC TGA TAG CCG CCT TC                      +
K110L      GAA GGC GGC TAT CAG TTC TCT GAA CG                      -
K91E       CAT AGA TCA GGA ATA TGT TCG TGA CC                      +
K91E       GGT CAC GAA CAT ATT CCT GAT CTA TG                      -
K110E      CGT TCA GAG AAC AAA TAG CCG CCT TC                      +
K110E      GAA GGC GGC TAT TTC TTC TCT GAA CG                      -
K91L       CAT AGA TCA GGC CTA TGT TCG TGA CC                      +
K91L       GGT CAC GAA CAT ACC CCT GAT CTA TG                      -
L82A       GGC TGC TCA GTT CCC GGC AAT AAA TTA TTC ATA G           +
L82A       CTA TGA ATA ATT TAT TGC CCC CAA CTG AGC AGC C           -
ΔT(-1)     GGA GAT ATA CCA TGC AGT TCC TGT TCT C                   +
ΔT(-1)     GAG AAC AGG AAC TGC ATG GTA TAT CTC C                   -

(a) Primer strand refers to the direction of the polymerization; + is sense to the gene and -
is antisense to the gene.

DNA  Sequencing 

The  genes  in  the  plasmid  DNA  were  sequenced  to  determine  the  presence  of  the 
correct  mutations  and  to  ensure  the  absence  of  any  spurious  mutations.  The  methods  of 
sequencing  were  based  on  the  dideoxy-chain  termination  method  (Sanger  et  al.,  1977). 
Samples  were  either  sequenced  manually  or  prepared  and  sequenced  at  the  University  of 
Florida  DNA  Sequencing  Core  Laboratory.  Both  methods  required  the  generation  of  a 
primer located 5' to the gene start codon (5'-CGA TCC CGC GAA ATT AAT AC) and,
complementary  to  the  opposite  strand,  a primer  located  downstream  from  the  translation 
termination  sequence  (primer  D of  Table  2). 

For manual sequencing, the Sequenase 2 (US Biochemicals) protocol was
employed. Freshly prepared DNA samples (3-5 µg) were denatured in a small microcentrifuge
tube by mixing the DNA with an equal volume of water (10 µl) and 2 µl of a fresh solution
of 2 M NaOH and 2 mM EDTA. The DNA was incubated at 37°C for 30 minutes, and
2 µl of 2 M sodium acetate pH 4.6 was then mixed with the DNA. Cold absolute ethanol
(75 µl) was added to precipitate the DNA, and the samples were
mixed and placed in a -70°C freezer overnight. The next day, samples were centrifuged
at 12,000 rpm for 10 minutes. Supernatants were decanted, and 50 µl of cold 70%
ethanol was added to resuspend the pellets. After another centrifugation for 10 minutes at
12,000 rpm, the supernatant was decanted and the pellets were dried at 37°C.

DNA samples were suspended in 7 µl of water, 2 µl of Sequenase reaction buffer,
and 1 µl of primer (0.5 pmol) to anneal the primer to the template. Tubes were heated in
the thermocycler for 2 minutes at 65°C, and the temperature was ramped down to 23°C
over 30 minutes. Tubes were then returned to ice to cool the reactants.

The dGTP labeling mix, including 7.5 µM of dGTP, dATP, dTTP and dCTP, was
diluted 1:5 for use in all of the reactions. The Sequenase enzyme (13 U) was combined
with the dilution buffer and pyrophosphatase (0.005 U) in a 1:6.5:0.5 volume ratio. A
mixture of 0.1 M DTT, 100 µCi [α-35S]-dATP, the diluted dGTP labeling mix, and the
prepared Sequenase mix was made on ice in a 1:1:2:2 volume ratio, and 6 µl of this
mixture was added to each of the template-primer DNA reaction tubes. The tubes were
kept at room temperature for 5 minutes.

For each of the primer reactions, four wells in a multiwell plate were prepared for
termination by each base, G, A, T, and C, and 2.5 µl of the appropriate dideoxynucleotide
termination mix was added to the wells. Each termination mixture contained 80 µM of
each of the four deoxynucleotides, 50 mM NaCl, and 8 µM of the appropriate
dideoxynucleotide. The DNA reaction mixtures were distributed into the multiwell plate
at 3.5 µl per well. The plate was incubated at 37°C for 5 minutes. Terminal
deoxynucleotidyl transferase (TdT) was added to assist with the complete polymerization
of DNA containing high GC content or secondary structure (Kho and Zarbl, 1992).
One µl of the TdT mix (6 µl TdT termination buffer, 20.3 µl dH2O, 2.3 µl TdT
(30 U), and 1.5 µl of 20 mM dNTP) was added to each well, and samples were returned to
a 37°C water bath for 30 minutes. Reactions were halted with the addition of 4 µl of the
stop solution (95% formamide, 20 mM EDTA, 0.05% bromophenol blue, and 0.05%
xylene cyanol).

The urea-acrylamide gel was poured using freshly dissolved 7 M urea, 0.6X TBE
buffer, and polyacrylamide solution (6% w/v). Polymerization was induced by the
addition of 65 µl of TEMED and 650 µl of freshly prepared 10% ammonium persulfate.
After polymerizing, the gel was set into the electrophoresis unit, and the temperature of
the gel was brought to 50°C by passing a current through the gel. The samples were
placed onto a 90°C heating block for 3 minutes and returned to ice immediately before
being loaded into the lanes of the gel. Electrophoresis proceeded for 2 hours in 1X TBE.
The gel was then removed from the plates with a piece of Whatman film, and plastic
wrap was layered over the other side of the gel before the covered gel was set into a
Model 583 Gel Dryer (Bio-Rad) for 1-2 hours. Autoradiography film was placed onto
the gel and allowed to expose for 1-3 days before XOMat processing.

For samples submitted to the Sequencing Core for analysis, mini-prep DNA was prepared
with either the Promega Mini-Prep or Qiagen Mini-Prep kit. The
purified DNA was analyzed for contaminants by agarose gel electrophoresis.
The samples were concentrated by speed vacuum evaporation of the water and pelleted,
and the DNA was reconstituted in 15 µl of sterile water. The DNA submitted to the
University of Florida DNA Sequencing Core was of sufficient quality and quantity for
the sequencing reactions to be performed using synthetic primers to the 5' end of the sense
strand and the 5' end of the complementary strand. The sequences were generated by the ABI
Prism Dye Terminator cycle sequencing protocols developed by Applied Biosystems
(Perkin-Elmer). The fluorescently labeled extension products were analyzed on an
Applied Biosystems Model 373 Stretch DNA Sequencer or 377 DNA Sequencer
(Perkin-Elmer).

Protein  Expression 

Purification of the recombinant inhibitor involved the overexpression of the gene
product from E. coli BL21[DE3] cells containing the pET-3d/pi-3 vector. Overnight
cultures of cells were grown in LB or minimal media. All components of the minimal
media or LB media were autoclaved or sterilized through a 0.45 µm filter (Nalgene). A
10X solution of M9 salts (0.5 M Na2HPO4, 0.22 M KH2PO4, 0.1 M NaCl, 0.2 M NH4Cl)
was prepared and filter sterilized. A 2% inoculation of the cells was made into freshly
prepared minimal media (0.5% casamino acids, 1X M9 salts, 1 mM MgSO4, 1 mM
CaCl2, 1 mg/ml thiamine-HCl, 0.2% glucose, and 50 µg/ml ampicillin), and the cells were
grown to an OD600 of 0.6 in a 37°C incubator rotating at 300 rpm. Protein expression was
induced with 0.45 mM IPTG. Three hours following IPTG addition, the cells were harvested by
centrifugation at 8000 rpm for 15 minutes and resuspended in 4.2 ml of TN buffer (50
mM Tris pH 7.4, 150 mM NaCl, 1 mM MgCl2) for every gram of wet weight of cells. At
each hour after induction, one milliliter of cells was removed from the cultures and
centrifuged at 12,000 rpm for one minute. The supernatant was discarded, and the pellet
was resuspended in 100 µl of TE buffer (pH 8.0) and 30 µl of Laemmli sample buffer
(Laemmli, 1970). These cells were passed through a 27-gauge needle three times to lyse
them. Samples were boiled for 3-5 minutes and loaded into the wells of a 12%
Tris-tricine gel.
Selenomethionine incorporation. The protein rPI-3 was expressed containing the
methionine analog selenomethionine to provide an internal heavy-atom form of
the protein for crystallographic studies. The pET-3d/pi3 vector was transformed into
E. coli strain B834[DE3], which is a methionine auxotroph. The expression was similar to
the basic protocol above, with the following modifications. An overnight culture of
bacteria was grown in LB media, and a 4% inoculation was made into freshly prepared
modified minimal media. The modified media was prepared with no casamino acids,
twice the concentration of glucose (0.4%), and 100 mg/L D,L-selenomethionine (Sigma);
all other components were present at the same concentrations as shown above. The
expression procedure was otherwise the same. The purification was performed as described
below; however, 2 mM dithiothreitol was added to all the buffers used in
dialysis and chromatography.

Inclusion  Body  Isolation 

The suspended cells were treated with 80 units of deoxyribonuclease I (Sigma, 60
U/µl) per milliliter of TN buffer added. Cells were lysed by two passes through an
SLM-Aminco French pressure cell at 1,000 psi. Inclusion bodies were recovered by
gently layering the cell lysate over 10 ml of 27% sucrose in 30-ml Corex tubes. Tubes
were centrifuged for 45 minutes at 8000 rpm in a JS-13.1 swinging bucket rotor in a J2-21
Beckman centrifuge. The supernatants were aspirated off of the pellets. Pellets were
suspended in 5 ml of TN-Triton X-100 (1%) buffer and again layered over 27%
sucrose. Another centrifugation step was used to pellet the inclusion bodies, and the
supernatants were discarded. Pellets were resuspended in TE buffer (10 mM Tris pH
7.4, 0.1 mM EDTA) at 50 mg/ml. Samples were added to the denaturing solution at an
approximate concentration of 1 mg/ml wet weight of pellet. The denaturing solution was
composed of 8 M ultrapure urea (Gibco-BRL) that was freshly prepared, mixed with
Dowex ion exchange resin, and filtered; 300 mM β-mercaptoethanol; 50 mM CAPS
buffer pH 10.5; 1 mM EDTA; 1 mM glycine; and 0.5 M NaCl. The inclusion bodies
were stirred at room temperature for one hour before being loaded into Spectra-Por 1
(6,000-8,000 molecular weight cutoff) dialysis membranes (Spectrum). The suspension of
denatured protein was dialyzed at room temperature against 50 mM Tris pH 11. After one
hour, the buffer was exchanged with fresh 50 mM Tris pH 11. After another hour, the buffer
was changed to 50 mM Tris pH 7.5, and cylinders were moved to the 5°C cold room. Dialysis was
allowed to proceed for five hours before the next buffer change to 20 mM MOPS pH 7.0
with 150 mM NaCl. This buffer was maintained for 10-12 hours before a final exchange
with the same MOPS/NaCl buffer for an additional 6 hours.

Ammonium  Sulfate  Precipitation 

An ammonium sulfate precipitation was performed on the post-dialysate.
(NH4)2SO4 was slowly added to the post-dialysate with stirring on ice to 40% salt
saturation. Following the complete addition of salt, the solution was stirred for another
hour. The solution was centrifuged in a JA-14 rotor at 12,500 rpm for 40 minutes. The
centrifugation supernatant was returned to the ice and stirred. (NH4)2SO4 was slowly
added to achieve 70% saturation. As before, following the complete addition of salt the
solution was stirred for another hour before being centrifuged. This second pellet, at 70%
(NH4)2SO4 saturation, was saved and predominantly contained the rPI-3.
Gel Filtration Chromatography

This 70% pellet was resolubilized in 100 mM pH 5.0 sodium phosphate buffer and
centrifuged to remove any insoluble matter. The soluble protein was loaded onto a
HiLoad 16/60 Superdex 75 (Amersham Pharmacia) gel filtration column that had been
pre-equilibrated in the same buffer. The FPLC system (Amersham Pharmacia) driven by an
LCC-500 Plus controller was used to automate the injection of sample onto the column
and to collect the eluting fractions.

Protein  Analysis 

Samples from fractions in and near the peaks were analyzed by SDS-PAGE on a
12% Tris-Tricine discontinuous mini-gel (Bio-Rad). To detect any impurities in the
proteins, wells were completely filled with 40 µl of the fractions combined with 10 µl
of 5X LSB. Samples were boiled for 3-5 minutes before being loaded into the wells.
Running buffers were 0.1 M Tris pH 8.25, 0.2 M Tricine, 0.1% SDS as the cathode buffer
and 0.2 M Tris pH 8.9 as the anode buffer. Gels were stained with Coomassie for
at least thirty minutes and destained in a 10% glacial acetic acid / 5% ethanol solution.
Additionally, all samples taken during the expression and purification of the protein were
loaded onto SDS-polyacrylamide gels for visualization. Samples were also measured for
function in an assay with pepsin, as described below. Column fractions that showed
homogeneous protein content by gel filtration chromatography and SDS-PAGE
analysis, and functional inhibition of pepsin, were pooled. Proteins purified from
the column were measured by the absorbance at 280 nm for an estimate of protein
concentration. Based on the A280 values, 1.5-3.0 nmoles of protein were
submitted to the UF-ICBR Protein Chemistry Core for hydrolysis and amino acid
analysis to compute a more accurate protein concentration for the inhibition assays and
structural studies.

Spectrophotometric  Assays 

Kinetic assay. Porcine pepsin was incubated for 3 minutes at 37°C in 0.1 M
sodium formate, pH 3.5. Reactions were initiated upon the addition of
different concentrations of the chromogenic substrate K-P-I-E-F*Nph-R-L. The assay
measured the hydrolysis of the substrate at the Phe*Nph (nitrophenylalanine)
peptide bond, which was monitored by the decrease in the average absorbance from
284-324 nm on a Hewlett-Packard 8452A diode-array spectrophotometer equipped with a
7-cell multitransport (Dunn et al., 1994; Scarborough and Dunn, 1994). Initial rate data
were collected for at least six different substrate concentrations to determine the Vmax and
Km of the enzyme for this substrate by fitting the Michaelis-Menten relationship of
equation 1 with a Marquardt algorithm (Marquardt, 1963)

Eq. 1    $v = \dfrac{V_{max}\,[S]}{K_m + [S]}$.

The basic kinetic properties of other enzymes were measured in a similar manner;
however, the time of pre-incubation and the buffer differed among the
enzymes. Human cathepsin D required a 3 minute incubation and pH 3.8. Recombinant
human cathepsin E required a 3 minute incubation and a pH 4.4 buffer. The recombinant
plasmepsin II required a 5 minute incubation and a pH 4.4 buffer.
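The Marquardt fit of equation 1 can be sketched as follows. This is an illustrative pure-Python Levenberg-Marquardt loop for the two parameters Vmax and Km, not the program used in the original work; the starting guesses, the damping schedule, and all function names are assumptions of this sketch.

```python
def mm_rate(s, vmax, km):
    """Equation 1: initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def sum_sq(s_vals, v_vals, vmax, km):
    """Sum of squared residuals between observed and predicted rates."""
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(s_vals, v_vals))


def fit_michaelis_menten(s_vals, v_vals, iterations=100):
    """Levenberg-Marquardt estimate of (Vmax, Km) from initial-rate data."""
    # Starting guesses (an assumption of this sketch): the largest observed
    # rate for Vmax and a mid-range substrate concentration for Km.
    vmax, km = max(v_vals), sorted(s_vals)[len(s_vals) // 2]
    lam = 1e-3  # damping factor
    best = sum_sq(s_vals, v_vals, vmax, km)
    for _ in range(iterations):
        # Accumulate the normal equations J^T J and J^T r.
        a11 = a12 = a22 = g1 = g2 = 0.0
        for s, v in zip(s_vals, v_vals):
            d_vmax = s / (km + s)                 # d(rate)/d(Vmax)
            d_km = -vmax * s / (km + s) ** 2      # d(rate)/d(Km)
            r = v - mm_rate(s, vmax, km)
            a11 += d_vmax * d_vmax
            a12 += d_vmax * d_km
            a22 += d_km * d_km
            g1 += d_vmax * r
            g2 += d_km * r
        # Damp the diagonal, then solve the 2x2 system by Cramer's rule.
        b11, b22 = a11 * (1 + lam), a22 * (1 + lam)
        det = b11 * b22 - a12 * a12
        if det == 0:
            break
        step_vmax = (g1 * b22 - g2 * a12) / det
        step_km = (g2 * b11 - g1 * a12) / det
        trial = sum_sq(s_vals, v_vals, vmax + step_vmax, km + step_km)
        if trial < best:   # accept the step and relax the damping
            vmax, km, best = vmax + step_vmax, km + step_km, trial
            lam /= 10
        else:              # reject the step and increase the damping
            lam *= 10
    return vmax, km
```

With real data the residuals do not vanish, so the returned values should be read together with the residual sum of squares; a library routine with parameter error estimates would normally be preferred over this sketch.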

Inhibition assay. Proteolytic enzyme inhibition was measured as the decrease in
cleavage of the chromogenic peptide substrate with the addition of the inhibitor.
Following the initial enzyme kinetic analyses, at least 2 different concentrations of the
inhibitor were incubated with the enzyme for 5 minutes under the same conditions as
before. For each inhibitor concentration, five different substrate concentrations were
used to initiate the reactions. Initial rate data were then collected, and competitive
dissociation values, Ki, were determined by equation 2, fit with a Marquardt analysis

Eq. 2    $v = \dfrac{V_{max}\,[S]}{K_m\,(1 + [I]/K_i) + [S]}$.
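Equation 2 says that a competitive inhibitor leaves Vmax unchanged and raises the apparent Km by the factor (1 + [I]/Ki). A minimal sketch of that relationship follows; the function names and the numbers in the usage note are hypothetical and are not taken from the measurements.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Equation 2: initial rate with a competitive inhibitor at concentration i."""
    return vmax * s / (km * (1 + i / ki) + s)


def apparent_km(i, km, ki):
    """Apparent Km seen by the substrate when inhibitor is present."""
    return km * (1 + i / ki)
```

For example, with an inhibitor concentration equal to Ki the apparent Km is twice the true Km, so the half-maximal rate is reached at twice the substrate concentration needed without inhibitor.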

Tight-binding inhibition. Many of the inhibitory reactions measured indicated that
the affinity of the inhibitor for the protease was of the same order of magnitude as the
concentration of the protease in the assay. For these cases, the binding
affinity was determined by a different method for tight-binding inhibitors (Henderson,
1972). The assay began with a five minute incubation at 37°C of a constant amount of
the enzyme, 3-10 nM, in the appropriate buffer (at 0.1 M), with the inhibitor at
varying concentrations from 0.1-500 nM. The reactions were initiated with the addition
of equal concentrations of the substrate K-P-I-E-F*Nph-R-L and loaded into the cuvettes
in the sample transporter. Concentrations of the substrate used were near the Km of the
substrate for the specific enzyme. Measurements were made as before, and initial rate
data were collected. The analysis of a series of these data was made by equation 3
(Henderson, 1972) using the Enzfitter program (Leatherbarrow, 1987)

Eq. 3    $\dfrac{[I_t]}{1 - v_i/v_0} = [E_t] + K_i \left(\dfrac{[S] + K_m}{K_m}\right) \dfrac{v_0}{v_i}$.
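Equation 3 is linear in v0/vi: a plot of [It] / (1 - vi/v0) against v0/vi has intercept [Et] and slope Ki([S] + Km)/Km. The sketch below recovers both quantities with an ordinary least-squares line. It is an illustration of the algebra only and does not reproduce the Enzfitter program; the function names are assumptions.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def henderson_fit(inhibitor_totals, inhibited_rates, v0, s, km):
    """Return ([Et], Ki) from equation 3.

    inhibitor_totals: total inhibitor concentrations [It]
    inhibited_rates:  initial rates vi measured at each [It]
    v0:               uninhibited initial rate
    s, km:            substrate concentration and Km, in the same units
    """
    xs = [v0 / vi for vi in inhibited_rates]
    ys = [it / (1 - vi / v0) for it, vi in zip(inhibitor_totals, inhibited_rates)]
    slope, intercept = linear_fit(xs, ys)
    ki = slope * km / (s + km)   # undo the ([S] + Km) / Km factor
    return intercept, ki
```

The returned Ki carries the concentration units of the inhibitor, and the intercept gives an independent estimate of the active enzyme concentration that can be compared with the amount added to the assay.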

Denaturation  Curve  Analysis 

Fluorescence spectrometry. Fluorescence spectroscopy was used to measure the
change in fluorescence intensity of the native tryptophan residue in the rPI-3. These
measurements were conducted on an SLM-Aminco 4800C Spectrofluorometer. All
measurements were performed at the ambient temperature of 25°C. For the tryptophan
fluorescence, a wavelength of 280 nm was used to excite the sample, and emission
spectra were collected from 300-400 nm, with maximal fluorescence detected near 360
nm.

Reversibility of denaturation. The samples were analyzed to ensure that the
measured fluorescence shifts were reversible, signifying a structural equilibrium between
folded and unfolded states. The protein samples were diluted to 0.16 mM
in 0.1 M sodium phosphate buffer pH 5.0. Three different preparations were made from
this initial dilution. Two samples were diluted in a 7.2 M solution of urea,
and a third sample was diluted into the phosphate buffer. After one hour of
equilibration, the samples were diluted ten-fold (final rPI-3 concentration brought to 5.4
µM) into 7.2 M urea, phosphate buffer, and 0.72 M urea, respectively. These were
permitted to re-equilibrate for 30 minutes before being analyzed on the
spectrofluorometer. The analysis was repeated with the protein in the presence of 35 mM
β-mercaptoethanol. Immediately after collecting the data, the samples were analyzed in a
functional assay with pepsin to confirm that the protein had refolded into an active
conformation. Five µl of each sample was incubated with pepsin in a 0.1 M sodium formate
buffer at pH 3.5 for 5 minutes before the addition of the substrate, as
previously described.

Urea denaturation. Stock urea solutions were analytically prepared by adding a weighed
amount of sodium phosphate buffer pH 5.0 to the weighed ultrapure urea. The molar concentration
of urea (Murea) was precisely determined by two methods (Pace, 1997). The first method
used the masses from this analytical preparation of the urea in the phosphate buffer, calculated
by equation 4

Eq. 4    $M_{urea} = \dfrac{\mathrm{mass}_{urea} / \mathrm{MW}_{urea}}{\mathrm{mass}_{urea+buffer} / d}$,

where d is the density of urea in water given by the relationship
$d = 1 + 0.2658\,W + 0.0330\,W^{2}$, with $W = \mathrm{mass}_{urea} / \mathrm{mass}_{urea+buffer}$. For the second method, the
refractive index of the solution was measured on a Fisher refractometer, and the refractive
index of the buffer was subtracted from it. The determination of the concentration required
that this difference in refractive indexes (ΔN) be applied to equation 5,

Eq. 5    $M_{urea} = 117.66\,(\Delta N) + 29.753\,(\Delta N)^{2} + 185.56\,(\Delta N)^{3}$.

Samples of the urea were only used when the difference in the values of these determined
concentrations was less than 1%.
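The two concentration estimates and the 1% acceptance rule can be sketched as below. Several details are assumptions of this sketch rather than statements from the text: masses are taken in grams and the density in g/ml, so a factor of 1000 converts mol/ml to mol/L; the molecular weight of urea is taken as 60.06 g/mol; and the percent difference is computed relative to the mean of the two estimates.

```python
UREA_MW = 60.06  # g/mol, molecular weight of urea (assumed value)


def urea_molarity_by_weight(mass_urea_g, mass_total_g):
    """Equation 4 with the density relationship given in the text.

    Assumes masses in grams and density in g/ml; the factor of 1000
    converts mol/ml to mol/L.
    """
    w = mass_urea_g / mass_total_g                 # weight fraction of urea
    density = 1 + 0.2658 * w + 0.0330 * w ** 2     # g/ml
    volume_ml = mass_total_g / density
    return 1000 * (mass_urea_g / UREA_MW) / volume_ml


def urea_molarity_by_refraction(delta_n):
    """Equation 5: molarity from the refractive index difference."""
    return 117.66 * delta_n + 29.753 * delta_n ** 2 + 185.56 * delta_n ** 3


def concentrations_agree(m1, m2, tolerance=0.01):
    """True when the two estimates differ by less than the tolerance (1% by default)."""
    return abs(m1 - m2) / ((m1 + m2) / 2) < tolerance
```

A stock would be accepted only when `concentrations_agree` returns True for the weight-based and refraction-based values of the same solution.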

Stocks of the wild-type rPI-3 were diluted to 1.5 µM in 0.1 M sodium phosphate
buffer pH 5.0 and the freshly prepared 10.0 M urea solution. These solutions, containing
between 0 and 9.3 M urea, were permitted to equilibrate for an hour before measurements
were made with the spectrofluorometer. Samples were measured as described above, in a
1-cm quartz cuvette.

The same experiment was performed with the addition of β-mercaptoethanol.
These solutions were prepared in the same way as before; however, 35
mM β-mercaptoethanol was included in every sample. The amount of β-mercaptoethanol
was chosen based on a concentration that did not interfere with the intensity of the
fluorescence emission.

Emission spectral data were compared at 350 nm. The fluorescence intensity values
were plotted versus the urea concentration, showing a two-state transition from the native to
the denatured state. The regions of the curve corresponding to the folded state of the
protein and the unfolded protein were each fit by a linear regression analysis. The two lines
were extrapolated to predict the fluorescence intensity for the native or denatured states
(Fn or Fd). An equilibrium constant, Kdn, and free energy of unfolding, ΔGdn, for the
transition data were determined by equation 6

Eq. 6    $K_{dn} = \dfrac{F_n - F}{F - F_d} = \exp(-\Delta G_{dn} / RT)$,

where R is the gas constant, 1.987 cal deg^-1 mol^-1, and T is the absolute temperature. The
free energy of unfolding under denaturant conditions has been shown to be
linearly related to the denaturant concentration and can be extrapolated to the state with
no denaturant, which is defined as ΔG(H2O) (Pace, 1986). The dependence of ΔG on
urea, m, is defined by (ΔG(H2O) - ΔG) / Murea. The midpoint of urea
denaturation, [urea1/2], is used to describe the relative stability of the protein, as is the free
energy of unfolding in water.

Eq. 7    $[\mathrm{urea}_{1/2}] = \Delta G(\mathrm{H_2O}) / m$.
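The chain from fluorescence readings to ΔG(H2O), m, and the denaturation midpoint (equations 6 and 7) can be sketched as follows. For brevity the sketch takes the native and denatured baseline intensities as given numbers, whereas the text obtains them by extrapolating fitted baselines; the temperature of 298.15 K corresponds to the 25°C stated for the measurements, and all function names are assumptions.

```python
import math

R = 1.987    # gas constant, cal deg^-1 mol^-1 (as given in the text)
T = 298.15   # absolute temperature for measurements at 25 C


def delta_g_unfolding(f, f_native, f_denatured):
    """Equation 6: free energy of unfolding from one fluorescence reading."""
    k_dn = (f_native - f) / (f - f_denatured)
    return -R * T * math.log(k_dn)


def extrapolate_to_water(urea_molar, delta_g_values):
    """Fit dG = dG(H2O) - m * [urea]; return (dG(H2O), m, midpoint from Eq. 7)."""
    n = len(urea_molar)
    mean_x = sum(urea_molar) / n
    mean_y = sum(delta_g_values) / n
    sxx = sum((x - mean_x) ** 2 for x in urea_molar)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(urea_molar, delta_g_values))
    slope = sxy / sxx
    dg_water = mean_y - slope * mean_x   # intercept at zero denaturant
    m = -slope                           # dG falls as urea increases
    return dg_water, m, dg_water / m
```

Only readings within the transition region should be passed to `delta_g_unfolding`, since the logarithm is undefined when the reading coincides with either baseline.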

Circular  Dichroism 

Circular dichroism spectropolarimetry was performed to determine the structural
similarity among the mutated forms of the rPI-3 protein. The data were collected using a
JASCO J-500c CD Spectropolarimeter, coupled with an IF-500II data converter to
transfer the signal to the hard drive of a 286 PC. The data could be analyzed using either the
J-600 software package from JASCO or imported into SigmaPlot for unit conversion
from ellipticity units to molar ellipticity. Each sample of purified protein was diluted to
0.01-0.1 mg/ml and delivered into either a 1-cm or 1-mm quartz cuvette.
Data were collected over the wavelength range from 260-195 nm with a time constant of
2.0 seconds, a band width of 1.0 nm, and a scan rate of 20 nm per minute. These data were
collected in units of ellipticity and could be converted to molar ellipticity by equation 8

Eq. 8    $[\theta]_{mrw} = \theta / (10 \cdot c \cdot l)$.

The values for θ were obtained from the average of three scans. The term c corresponds
to the protein concentration in mg/ml, and l is the cell pathlength in centimeters.

Mass  Spectrometry 

Purified proteins were diluted into water to a concentration between 1-100 µM.
Either of two matrices designed for sample ionization in the mass spectrometer,
sinapinic acid (3,5-dimethoxy-4-hydroxy cinnamic acid) or α-cyano-4-hydroxy
cinnamic acid, was dissolved in a 10%/70% formic acid/acetonitrile solution to 10
mg/ml. The protein and matrix were mixed in an equal volume ratio, and 1 µl of the
mixture was placed onto a well on the loading tray. Multiple concentrations of the
samples were prepared in this manner. The samples were dried on the plate, and the plate
was inserted into the PerSeptive Biosystems Voyager MALDI-TOF mass spectrometer.
The laser was fired onto the sample plate at a well containing the sample. The intensity
of the laser was adjusted to maximize the peak size of components in the mixture and to
increase the signal-to-noise ratio.

Protein  Digest  and  Analysis 

The rPI-3 protein was analyzed for its apparent sensitivity to other endopeptidases
as agents to fragment the inhibitor for disulfide bond pair assignments. The protein
contains 10 lysine residues and 3 arginine residues, with at least one basic residue
between each pair of consecutive cysteine residues among the six. Digestion of rPI-3 with
trypsin or endoproteinase Lys-C was performed with 100 µl of 0.28 mM rPI-3, 6.5 µl enzyme
(6.5 µg), and either ammonium bicarbonate buffer pH 7.8 or Tris-HCl pH 7.7 with EDTA,
respectively. The proteolytic hydrolyses were permitted to proceed overnight, and at time
points of 0, 1, 2, 4, 8, and 24 hours, 20 µl were removed. These samples were mixed
with 5 µl of 5X LSB, boiled, and loaded onto a 12% Tris-tricine gel. The gels were
electrophoresed and analyzed after staining in Coomassie dye and destaining in acetic
acid/ethanol. From the 24-hour time point, samples were inactivated by acidification
with 0.2 M sodium formate pH 3.8. These samples were diluted and mixed with the
α-cyano-4-hydroxy cinnamic acid matrix for MALDI mass spectrometry analysis. The
reactions of rPI-3 with the enzymes trypsin and endoproteinase Lys-C were also performed in
the presence of 20 mM β-mercaptoethanol in the same manner as described.


CHAPTER  3 

DEVELOPMENT  OF  THE  RECOMBINANT 
EXPRESSION  AND  PURIFICATION  SYSTEM 

Introduction 

Robert  Peanasky  and  Ghaleb  Abu-Erreish  purified  the  native  pepsin  inhibitor  PI-3 
as  part  of  a project  to  purify  different  classes  of  proteinase  inhibitors  from  the  Ascaris 
lumbricoides  parasitic  nematode.  Proteins  were  isolated  from  worms  that  were  collected 
at  abattoirs.  Rather  large  numbers  of  the  worms  were  necessary  for  the  isolation  of  this 
protein  because  of  the  low  expression  of  this  pepsin  inhibitor  (Martzen  et  al.,  1990). 
Additional attempts to study PI-3 were made while culturing the nematodes in the
laboratory. These culturing techniques were hard on the staff technicians, who would
slowly develop an anaphylactic response to antigens from the worms. The team was
eventually  successful  in  purifying  enough  of  this  protein  to  determine  the  primary  amino 
acid  sequence  of  PI-3  and  the  pairing  of  disulfide  bonds  formed  by  the  six  cysteine 
residues  in  the  protein. 

Due to the difficulty of isolating and purifying the native protein, developing a
recombinant protein expression system was necessary for extensive characterization of
PI-3. The determination of the primary sequence made it possible to isolate the gene
encoding PI-3 from a cDNA library of Ascaris suum. Engineering the pi3 gene into a
plasmid would allow the long-term storage of the gene in a transformed bacterial cell line
and permit modifications to the protein by genetic manipulation. The expression system
had to allow for ease in protein purification and protein yields sufficient for analyses in
this laboratory and for structural characterization of the protein by a collaborating team.
Within this chapter, I will describe the development of the current expression system
and the purification scheme for the recombinant pepsin inhibitor, rPI-3.

Results 

Protein  Expression 

Initially the pi3 gene was isolated from an Ascaris cDNA library and ligated into a
pGEM-T cloning vector. This cloned gene was then transferred into two expression
systems that were designed to add a leader sequence onto the amino-terminal end of the
protein. The additional protein sequence should act as a shuttle to direct the protein
outside of the cytoplasm. For the pET-22b(+) expression vector, the pelB leader directs
the protein to the periplasm of the E. coli host strain BL21[DE3]. In the case of the
pHIL-S1 Pichia pastoris expression system, the yeast leader sequence PHO1 directs the
secretion of the protein to the growth media. These two systems were chosen for the
purposes of a simple purification procedure and assistance with correct protein folding
and disulfide bond pairing by targeting the expressed protein outside of the chemically
reducing conditions of the cytoplasm.

These two secretion systems were also chosen for different reasons. The Pichia
expression system was chosen because of an uncertainty about whether or not the protein
required post-translational modifications. This system was also chosen for the possibility
that the protein secreted into the growth media might be simpler to purify than protein from
a bacterial expression system. The E. coli expression system was chosen in case
the protein did not require post-translational modifications and the
yeast expression system produced insufficient quantities of the protein for extensive
studies.

The level of rPI-3 expression in the yeast and bacterial systems was less than
expected. The Pichia system produced higher yields of the protein than were obtained
from an equal volume of bacterial cultures. The protein was purified from both systems;
however, yields in initial experiments were too low to qualitatively assess the purity of
the protein by Coomassie-stained protein gels. An antibody to the inhibitor was
required to judge the presence of the protein using a Western blot procedure from the
SDS polyacrylamide gel. The purity of the protein could not be assessed using such a
selective tool. Even after steps were taken to improve the yield of the protein from the
periplasm of the BL21[DE3] bacterial expression host, this system was still yielding less
than 0.1 mg of protein for every liter of growth media supplied. The proteins isolated
from both expression systems were analyzed in a functional assay and were potent
inhibitors of pepsin. Both expression systems therefore appeared to generate protein that
was fully functional; however, the expressed protein yields remained too low.

The gene was therefore sub-cloned into another bacterial expression system, pET-3d.
During the sub-cloning process, 6 nucleotides were added onto the 5' terminus of the
gene  encoding  a methionine  and  a threonine  for  the  ease  of  cloning  into  the  plasmid  and 
for  a translational  start  signal  (Figure  3-1).  This  plasmid  and  other  pET  vectors  have 
been  used  previously  for  the  expression  of  aspartic  proteinases  in  this  laboratory 
(Lowther,  1994,  Westling,  1998,  Bhatt,  1998).  With  protein  expression  under  the  control 
of  the  T7  polymerase,  the  E.  coli  machinery  has  successfully  translated  large  quantities  of 
1/1
ATG ACT CAG TTC CTG TTC TCC ATG TCT ACC GGT CCG TTC ATC TGC
MTQFLFSMSTGPFIC

46/16
ACC GTT AAG GAT AAT CAA GTG TTT GTG GCA AAT TTG CCT TGG ACG
TVKDNQVFVANLPWT

91/31
ATG TTG GAA GGA GAT GAT ATT CAA GTG GGT AAG GAA TTC GCC GCT
MLEGDDIQVGKEFAA

136/46
AGA GTT GAA GAT TGC ACA AAT GTG AAA CAC GAT ATG GCA CCA ACA
RVEDCTNVKHDMAPT

181/61
TGT ACG AAA CCA CCA CCA TTC TGT GGT CCA CAA GAT ATG AAG ATG
CTKPPPFCGPQDMKM

226/76
TTC AAT TTC GTT GGC TGC TCA GTT TTG GGC AAT AAA TTA TTC ATA
FNFVGCSVLGNKLFI

271/91
GAT CAG AAA TAT GTT CGT GAC CTC ACA GCT AAA GAT CAT GCT GAA
DQKYVRDLTAKDHAE

316/106
GTG CAA ACG TTC AGA GAA AAA ATA GCC GCC TTC GAA GAG CAG CAA
VQTFREKIAAFEEQQ

361/121
GAA AAT CAA CCA CCT TCA TCT GGA ATG CCA CAC GGA GCT GTT CCC
ENQPPSSGMPHGAVP

406/136
GCA GGT GGG CTA TCA CCT CCT CCA CCG CCG AGT TTT TGC ACT GTA
AGGLSPPPPPSFCTV

451/151
CAA taa taa
Q

Figure 3-1. DNA Sequence of PI-3 in pET-3d and Codon Translation

DNA bases are listed in codon triplets with the single-letter amino acid designation for
the translation products listed below. The sequence includes a two-codon extension at the 5'
end encoding a methionine and a threonine.
proteinases. Expression of the protein within the cell could lead to proper folding if
the cell has ample amounts of heat shock proteins to aid the folding process. However,
because the T7 polymerase drives overexpression of the protein, the cellular chaperones
might be overwhelmed by the amount of protein being produced, resulting in protein
misfolding and aggregation in the bacterial cytoplasm in the form of inclusion bodies
(Rudolph and Lilie, 1996). This insoluble protein could be resolubilized in urea and
slowly refolded by diluting or dialyzing the urea away to yield properly folded protein.
The formation of the insoluble matter also reduces the possibility of proteolytic digestion
by host cell peptidases.
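
The codon translation in Figure 3-1 can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and using the coding sequence exactly as transcribed from the figure; the names `CODON_TABLE`, `RPI3_CDS`, and `translate` are illustrative and are not part of the original work.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding sequence as transcribed from Figure 3-1 (includes the added
# Met-Thr codons at the 5' end and the two stop codons at the 3' end).
RPI3_CDS = """
ATG ACT CAG TTC CTG TTC TCC ATG TCT ACC GGT CCG TTC ATC TGC
ACC GTT AAG GAT AAT CAA GTG TTT GTG GCA AAT TTG CCT TGG ACG
ATG TTG GAA GGA GAT GAT ATT CAA GTG GGT AAG GAA TTC GCC GCT
AGA GTT GAA GAT TGC ACA AAT GTG AAA CAC GAT ATG GCA CCA ACA
TGT ACG AAA CCA CCA CCA TTC TGT GGT CCA CAA GAT ATG AAG ATG
TTC AAT TTC GTT GGC TGC TCA GTT TTG GGC AAT AAA TTA TTC ATA
GAT CAG AAA TAT GTT CGT GAC CTC ACA GCT AAA GAT CAT GCT GAA
GTG CAA ACG TTC AGA GAA AAA ATA GCC GCC TTC GAA GAG CAG CAA
GAA AAT CAA CCA CCT TCA TCT GGA ATG CCA CAC GGA GCT GTT CCC
GCA GGT GGG CTA TCA CCT CCT CCA CCG CCG AGT TTT TGC ACT GTA
CAA taa taa
"""


def translate(cds: str) -> str:
    """Translate whitespace-separated codons, stopping at the first stop codon."""
    protein = []
    for codon in cds.upper().split():
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


rpi3 = translate(RPI3_CDS)
n_residues = len(rpi3)        # length of the translated product
n_cysteines = rpi3.count("C")  # cysteines available for disulfide bonds
```

Counting the cysteines in the translated product is a quick consistency check against the six cysteine residues discussed for this protein.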

Purification of the pET-22b(+) Periplasmic Protein

Cells were harvested after three hours of growth following the induction of
expression by IPTG. The bacterial outer membranes were burst by osmotic shock,
leaving the inner membrane and cytoplasm intact. Anion exchange chromatography was
performed on the periplasmic suspension to separate the rPI-3 from bacterial
proteins. The chromatogram showed the elution of the protein in a peak following the
increase in the concentration of NaCl to 0.15 M in the 20 mM MOPS buffer, pH 7.0
(Figure 3-2). Analysis of the chromatographic peaks by SDS-PAGE indicated that the protein
had not been completely purified. The recombinant PI-3 was nevertheless shown to be an active
inhibitor of pepsin (Figure 3-3). The yield from this purification was, at
best, 0.05 mg of protein per liter of LB growth media. While the purification procedure
was simple, the amount of protein obtained was far too low to make
this a useful expression system. The yield was nearly 10-fold lower than the yield of protein
from the Pichia expression system. Comparable effects of inhibition
Figure 3-2. Anion Exchange Chromatography of the rPI-3 from pET-22b(+)

The chromatogram indicates the volume at which the different components elute from the
Hi-Q anion exchange column. Absorbance values are shown for the samples at 280 nm
(solid line), and the concentration of NaCl added to the 20 mM MOPS, pH 7.0, is shown
as the dashed line. Two peaks at 55 ml and 65 ml (A and B) contained active PI-3, when
the NaCl concentration had just been raised to 150 mM.
Figure 3-3. Activity Assay for the Function of Recombinant PI-3

Protein purified from the Hi-Q anion exchange column was assayed for inhibition of
pepsin. Pepsin activity is observed as the decrease in the average absorbance over time.
The straight lines represent tangents drawn to the initial velocity, -ΔA/time. Initial rate
data for substrate hydrolysis represent pepsin interactions with FPLC fractions from
peaks A, B, C, D, and E (Figure 3-2) and a pepsin control, numbered 1-6, respectively.
between the two proteins with pepsin were determined; therefore, post-translational
modifications to the protein are not required for PI-3 function.

Purification  of  rPI-3  from  the  pET-3d  System 

Recombinant  PI-3  was  expressed  for  three  hours  after  IPTG  addition,  and  typically 
3-4  grams  of  cells  were  harvested  for  each  liter  of  media.  From  the  SDS  polyacrylamide 
gel  (Figure  3-4),  expression  of  the  protein  could  be  detected  even  after  one  hour  of 
induction.  The  cells  were  lysed  by  French  pressure  cell,  and  inclusion  bodies  were 
isolated.  The  rPI-3  protein  was  the  major  constituent  of  the  cellular  aggregates,  though 
other  E.  coli  proteins  appear  to  have  either  formed  part  of  the  inclusion  bodies  or  simply 
sedimented  with  the  inclusion  bodies  (Figure  3-5).  Previous  preparations  of  inclusion 
bodies  of  proteinases  using  an  identical  procedure  typically  produced  a more  pure  sample 
of the induced proteins than was obtained for rPI-3. Three non-exclusive possibilities may
account for the greater heterogeneity of the rPI-3 inclusion body preparations. Other
bacterial proteins may be expressed at higher rates, and the pool of heat shock proteins
may be insufficient to assist the folding of all of the E. coli proteins. Because rPI-3 is
smaller than the proteinases, it may not aggregate as readily without the
incorporation of other proteins that are overexpressed and misfolded, though the
aggregation of proteins does not appear to correlate with protein size (Wilkinson and
Harrison, 1991). Finally, the six cysteine residues of this protein may not form complete
intramolecular disulfide bonds due to the reducing conditions in the cytoplasm. These
proteins might form disulfide bond pairs with other proteins, either with other rPI-3
proteins or E. coli cytoplasmic proteins.
Figure 3-4. Protein Induction Analyzed by SDS-PAGE

A 12%  SDS-PAGE  gel  was  used  to  show  the  cell  lysates  of  the  pre-induced  (lane  2)  and 
post-induced  cells  at  different  time  points.  The  times  were  0.5  hour  (3),  1 hour  (4),  2 
hours  (5),  and  3 hours  (6)  after  induction  with  0.45  mM  IPTG.  Lane  1 is  a molecular 
weight  marker  with  masses  listed  in  kDa. 
Figure 3-5. Gel of the Purification of rPI-3

A 12% SDS polyacrylamide gel was used to visualize the status of rPI-3 purification
through the various stages of the scheme. Lanes 1-2 show the induction of rPI-3
expression from pre-IPTG to 3 hours post-induction. Lanes 3 and 4 show the
inclusion body pellets after the 1st sucrose wash (3) and the 2nd
sucrose wash (4). The final two lanes show the pellets of the 40% ammonium
sulfate precipitation (5) and 70% ammonium sulfate precipitation (6).
The heterogeneous inclusion bodies were denatured in urea and β-mercaptoethanol
at alkaline pH. The protein was diluted to 1 mg of inclusion bodies (wet weight)
per milliliter of denaturant to assist with the unfolding and later refolding of the
protein into monomers. After the protein was refolded by dialysis, ammonium sulfate
precipitation was used both to remove contaminating proteins and to reduce the
volume of the sample loaded onto a Superdex 75 gel filtration chromatography
column (Figure 3-6). The homogeneous protein eluted after 70
milliliters of phosphate buffer had passed through the column (Figure 3-7). The column had been
standardized previously with bovine serum albumin, carbonic anhydrase, cytochrome c,
and aprotinin; plotting the ratio of the protein elution volume to the void volume against
the log of the molecular weight of the proteins resulted in a straight line. The rPI-3
elution volume fit very closely to the predicted volume for a protein of 16.7 kDa. Final
yields for the purification of rPI-3 from the pET-3d expression system were between 3 and 4
milligrams of protein for every liter of starting media.
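
The column standardization described above amounts to a straight-line fit of log10(molecular weight) against the ratio of elution volume to void volume (Ve/Vo). The sketch below illustrates that calculation only; the function names are hypothetical, and the measured Ve/Vo ratios for the standards are not tabulated in this section, so none are supplied here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(ve_over_vo, standards):
    """Estimate a molecular mass (kDa) from a Ve/Vo ratio.

    `standards` is a list of (mass_kda, ve_over_vo) pairs for the
    calibration proteins; the fit is log10(mass) versus Ve/Vo.
    """
    ratios = [ratio for _, ratio in standards]
    log_masses = [math.log10(mass) for mass, _ in standards]
    slope, intercept = fit_line(ratios, log_masses)
    return 10 ** (slope * ve_over_vo + intercept)
```

To use it, one would pass the masses of the four standards (66, 29, 12.4, and 6.5 kDa) paired with their measured Ve/Vo ratios, then evaluate the fit at the Ve/Vo ratio observed for rPI-3.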


Table 3-1. Purification Yield of rPI-3 from pET-3d Expression System

Stage of Purification              Yield per 1 Liter
Harvest of BL21[DE3] cells         1.5 g
Inclusion body isolation           350 mg
Ammonium sulfate precipitation     4-5 mg
Gel filtration chromatography      3-4 mg

[Figure 3-6: Superdex 75 gel filtration chromatogram of rPI-3; inset panel titled "Superdex 75 Standards"]

The major peak at 70 milliliters represents the active rPI-3 from the size exclusion column. The inset figure shows a semi-log plot of
the protein molecular weight (y-axis) against the ratio of the protein elution volume to void volume. The standardization of the column
was performed with bovine serum albumin (66 kDa), carbonic anhydrase (29 kDa), cytochrome c (12.4 kDa), and aprotinin (6.5 kDa),
and the alignment of rPI-3 (16.7 kDa) is shown in comparison to the other proteins.
Figure 3-7. SDS-PAGE Analysis of the Chromatographic Purification of rPI-3

This 12% SDS-PAGE gel shows the purity of the protein sample in the fractions from the
Superdex 75 chromatography of rPI-3 (Figure 3-6). Lane 1 is a standard marker.
Lanes 2-8 are taken from the fractions of the major peak from 64 ml to 77 ml, at 2-ml
increments.
Mass Spectrometry

The recombinant PI-3 was analyzed by MALDI-TOF mass spectrometry. Migration of the
protein through an SDS polyacrylamide gel had been slightly retarded, giving an apparent size of
nearly 21 kDa compared with the protein markers (Figure 3-4). However, the
protein eluted from a Superdex 75 gel filtration column at the volume predicted
for a 16.7 kDa protein. Given the discrepancy between the two methods of size
fractionation, the greater precision of a mass spectrum of rPI-3 was used to confirm that
the protein was the predicted size of 16,636 Da. Sinapinic acid was used as the matrix to
assist the ionization of the protein. The detected molecular mass of rPI-3
was 16,762 Da (Figure 3-8). The detected mass differed by less than 1% from the predicted
mass of rPI-3, showing that the recombinant protein was the expected size. Another
minor peak was present at nearly half the mass indicated by the rPI-3 peak, and this
might be indicative of a doubly charged rPI-3 ion, [M + 2H]2+.
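
The two numerical claims in this paragraph (a difference of less than 1% and a doubly charged species near half the mass) can be reproduced with a few lines of arithmetic. This is a minimal sketch: the proton mass is an approximate constant, the function names are illustrative, and the detected value is treated as the neutral mass for simplicity.

```python
PROTON_MASS_DA = 1.00728  # approximate mass of a proton in daltons


def percent_difference(observed_da, predicted_da):
    """Absolute difference between two masses as a percentage of the prediction."""
    return abs(observed_da - predicted_da) / predicted_da * 100.0


def mass_to_charge(neutral_mass_da, charge):
    """m/z of a species carrying `charge` added protons, [M + zH]z+."""
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


predicted = 16636.0  # predicted mass of rPI-3 (Da), from the text
observed = 16762.0   # detected mass of rPI-3 (Da), from the text

difference_pct = percent_difference(observed, predicted)  # roughly 0.76 %
doubly_charged = mass_to_charge(observed, 2)              # roughly 8382 m/z
```

The doubly charged value of roughly 8.4 kDa on the m/z axis matches the position of the minor peak described for Figure 3-8.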

Inhibition of Aspartic Proteinases by rPI-3

The purified protein was analyzed for functional inhibition of pepsin. Two
assay methods were used to determine the affinity of rPI-3 for porcine
pepsin. The first method involved collecting separate sets of initial rate data at different
concentrations of the inhibitor (Figure 3-9). The resulting Ki value of 1.2 nM was in
close agreement with the values previously reported for the inhibition of porcine and
human pepsin by the native inhibitor, 0.5 nM and 2.0 nM (Abu-Erreish and Peanasky,
1974b). A second method of determining the affinity of these proteins was used
because the determined Ki value was within the same order of magnitude as the
Figure 3-8. Mass Spectrum of the Recombinant Pepsin Inhibitor, rPI-3

The major peak is at the expected molecular mass of 16.7 kDa. A population of protein was detected at the [M+2H]2+ value for the
rPI-3, 8.4 kDa.
concentration of the pepsin used in the assay. Peter Henderson had proposed another
method to represent the inhibitor dissociation constant more accurately when the inhibitor
Ki value is within 100-fold of the enzyme concentration (Henderson, 1972). The
tight-binding inhibition model recognizes that the total inhibitor content is a mixture of both free
inhibitor and enzyme-bound inhibitor. This contrasts with the case in which the Ki is
more than 100-fold greater than the enzyme concentration. In this latter case, the
inhibitor is essentially all free, which is an assumption required for the Michaelis-Menten
analysis. The tight-binding inhibition method required the collection of initial rate data
for multiple concentrations of the inhibitor reacting with invariant concentrations of the
enzyme and substrate. A decrease in enzyme rates was measured with increasing
inhibitor concentrations (Figure 3-10), and these data were transformed by Henderson's
algorithm (Equation 3 in Chapter 2) to reflect the more accurate inhibitor affinity for
pepsin under these tight-binding conditions. For the recombinant PI-3 binding to
porcine pepsin, the dissociation constant was determined to be 1.3 nM and was not
significantly different from the value determined before.
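
Equation 3 of Chapter 2 is not reproduced in this section, so the following sketch assumes the commonly cited linear form of Henderson's relation, It/(1 - vi/v0) = Et + Ki(app) * (v0/vi), in which a plot of It/(1 - vi/v0) against v0/vi has slope Ki(app) and intercept Et. The data are synthetic, generated from the quadratic tight-binding expression for a chosen Ki; the function names and the enzyme concentration are illustrative and are not values from this study.

```python
import math


def fractional_velocity(e_total, i_total, k_i):
    """vi/v0 for a tight-binding inhibitor (quadratic solution for bound enzyme)."""
    s = e_total + i_total + k_i
    bound = (s - math.sqrt(s * s - 4.0 * e_total * i_total)) / 2.0
    return 1.0 - bound / e_total


def henderson_fit(i_totals, fractions):
    """Least-squares line through the Henderson plot.

    x = v0/vi, y = It / (1 - vi/v0); returns (slope, intercept),
    interpreted as (apparent Ki, total enzyme concentration).
    """
    xs = [1.0 / a for a in fractions]
    ys = [i / (1.0 - a) for i, a in zip(i_totals, fractions)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example, all concentrations in nM (assumed values for illustration).
E_TOTAL = 2.0
K_I = 1.3
inhibitor = [0.5, 1.0, 2.0, 4.0, 8.0]
fractions = [fractional_velocity(E_TOTAL, i, K_I) for i in inhibitor]
ki_app, e_fit = henderson_fit(inhibitor, fractions)
```

For a competitive inhibitor the fitted slope is an apparent constant that still contains the substrate term, Ki(1 + [S]/Km), so a correction for substrate concentration would be needed to recover Ki itself.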

The same methodology was used to measure the inhibition of
recombinant cathepsin E and human cathepsin D. Table 3-2 shows the similarity
between results gathered for the recombinant protein using the chromophoric substrate
assay and the results obtained previously with the native PI-3. Pepsin and
cathepsin D had previously been assayed using hemoglobin digestion or
N-acetyl-L-phenylalanyl-L-diiodotyrosine assays (Abu-Erreish and Peanasky, 1974a, Keilova and
Tomasek, 1972). Analysis of the inhibition of cathepsin E had been performed using a
chromophoric substrate assay (Jupp et al., 1988). The trends of the dissociation
[Plot: x-axis, [Inhibitor] (M)]

Figure 3-10. Tight-Binding Inhibition of rPI-3 to Pepsin

Increasing rPI-3 concentrations in the reaction decrease the rate of substrate hydrolysis
by pepsin. The inset figure depicts the conversion of the data by Henderson's
tight-binding inhibition algorithm (Eq. 3). The dissociation constant for porcine pepsin and
rPI-3 was determined to be 1.3 nM +/- 0.2 nM.
constants obtained with the recombinant protein were similar to those from different
assays  on  the  native  PI-3.  These  data  continue  to  suggest  that  the  rPI-3  is  fully  functional 
without  any  post-translational  modifications. 


Table 3-2. Dissociation Constants of PI-3 and Three Aspartic Proteinases

Enzyme                 Ki with Native PI-3 (nM)    Ki with rPI-3 (nM)
Porcine pepsin         0.5 (a)                     1.3 +/- 0.2
Human cathepsin E      10 (b)                      75 +/- 14
Human cathepsin D      N.D. (d) (>700) (c)         N.D. (d) (2000)

(a) Data taken from Abu-Erreish and Peanasky, 1974a. (b) Data taken from Jupp et al., 1988.
(c) Data taken from Valler et al., 1985. (d) N.D. = Data could not be determined at the
maximal inhibitor concentration. Estimated values listed in parentheses were the mean of
the Ki values calculated from % inhibition with four inhibitor concentrations using
Equation 3.

The results in Table 3-2 show differences between the native and recombinant proteins
in their affinity for these enzymes. The Ki of rPI-3 for pepsin is
about 2.6-fold higher than the reported value for native PI-3 (1.3 nM versus 0.5 nM), and the Ki of rPI-3 for
cathepsin E is 7.5-fold higher than the reported value for the native inhibitor (75 nM versus 10 nM);
that is, the recombinant protein appears to bind somewhat more weakly in both cases.
These discrepancies may be explained, in part, by differences in the inhibitors and in the
assays. The recombinant protein includes two amino acid residues at the amino terminus
that are not present on the native protein, which may contribute to some of the difference
in affinity. The assay and the exact conditions of the measurements of inhibition may also
contribute to some of the differences in the determined affinities, for which there is some
precedent. In the first study of PI-3 by Abu-Erreish and Peanasky (1974a), two
analyses to determine the Ki were used, which resulted in values of 0.10 and 0.91 nM
using the same protein. The analyses involved a hemoglobin substrate for pepsin and a
peptide substrate for pepsin, respectively. Therefore, some variability in absolute affinity
values can be expected when using different assays. For this reason, all measurements of
rPI-3 inhibition were performed using identical conditions.

Discussion 

The  development  of  a scheme  for  the  purification  of  rPI-3  in  the  laboratory  was 
sought  for  an  easier  means  of  producing  the  protein  than  had  been  previously  attainable. 
The  purification  of  rPI-3  could  be  used  to  confirm  the  behavior  of  the  protein  as  a tight- 
binding  inhibitor  of  aspartic  proteinases,  and  the  protein  could  be  produced  in  quantities 
great  enough  for  structural  characterization  to  be  attempted. 

The  secretion  expression  systems  were  initially  used  for  the  possible  ease  in  the 
purification  of  the  protein.  In  fact,  the  purification  procedures  for  the  two  systems  were 
simple,  and  some  purification  of  rPI-3  could  be  achieved  from  each  system;  however,  in 
neither  case  was  the  protein  purified  to  homogeneity.  The  procedures  were  improved, 
and the protein obtained was shown to be fully active against pepsin. These expression
systems were, however, inadequate for producing the large quantities of
rPI-3 required for structural analyses.

Purification of the protein from inclusion bodies was intended to improve the yield.
Purification of rPI-3 from an inclusion body preparation provided
the greatest purity and yield of protein of the three methods. This procedure
consistently yielded between 3 and 4 milligrams of fully
purified protein for every liter of culture media. The yield of rPI-3 provided enough material to
supply a collaborating team with starting protein for crystallographic experiments
(Petersen et al., 1998).

The rPI-3 inclusion body preparation was more heterogeneous than
preparations of other proteins in this laboratory. Three non-exclusive possibilities may
explain this. Because the inhibitor is small, it may not form inclusion bodies as easily as
the larger proteinases studied in this laboratory, although previous studies correlating protein
aggregation with protein size have not revealed any clear trends (Wilkinson and Harrison,
1991). The overexpression of some E. coli proteins may overwhelm the supply of heat
shock proteins constitutively expressed. The rPI-3 may also form some disulfide bonds
with E. coli proteins if incorrectly folded in the cytoplasm. Though the formation of
disulfide bonds within the reducing environment of the cytoplasm is retarded, the
oxidation of cysteine residues to form disulfide bonds can occur within the E. coli
cytoplasm (Rudolph and Lilie, 1996).

The effects of the protein aggregation are consistent with the discovery made by
Kageyama (1998) that the PI-3 gene contains a leader sequence for probable
secretion of the protein. Such a leader sequence may target the protein to the cell
membrane  for  secretion  outside  the  cell,  where  the  protein  can  fold  into  the  proper 
conformation.  The  absence  of  this  sequence  may  also  contribute  to  the  lower  efficiency 
of  refolding  in  the  bacterial  cytoplasm,  leaving  more  of  the  protein  misfolded  and 
potentially  reactive  to  bacterial  proteins.  However,  it  is  possible  that  within  the  reduced 
environment,  the  additional  leader  sequence  may  not  aid  the  protein  folding  enough  to 
prevent  aggregation. 
The rPI-3 was readily tracked through the purification procedure by gel
electrophoresis, unlike the purification using the secretion systems. The inclusion body
composition was heterogeneous, as can be seen from Figure 3-4. The contaminating
proteins appear to be fully removed during the ammonium sulfate precipitation and gel
filtration chromatography stages. While some rPI-3 is lost in the 40% ammonium sulfate
precipitate, a larger portion of the higher molecular weight proteins is also removed from
the solution at that stage. At 70% w/v ammonium sulfate, only one
protein is visible on a Coomassie-stained gel. Small amounts of contaminating
proteins must still precipitate with rPI-3, given the presence of a second peak on the
chromatogram; however, the proteins eluting in the smaller peak fractions are minor
components of the pre-column protein sample. Following the gel filtration
chromatography, rPI-3 appears to be the only constituent.

This protein was fully functional as a pepsin inhibitor and behaved in a manner
similar to the native protein as previously studied (Martzen et al., 1991), which will be
described further in Chapter 5. The differences in inhibition among the three aspartic
proteinases suggest that the inhibitor is selective toward only some of the proteins in this
class. The three proteinases shown here differ in the amino acid composition on the
active  site  surface,  where  substrates  and  inhibitors  might  bind,  and  elsewhere  on  the 
protein  surfaces.  These  proteinases  are  known  to  possess  different  substrate  specificities 
determined  by  the  amino  acid  compositions  in  the  active  site.  These  proteinases  also 
have  different  isoelectric  points  that  generally  correlate  with  the  environment  in  which 
the  proteins  are  commonly  active.  Pepsin  has  an  isoelectric  point  of  3.2,  and  it  is  secreted 
into  the  acidic  stomach.  Cathepsin  E and  cathepsin  D are  both  intracellular  proteinases 
with pI values of 4.1 and 5.6, respectively. These data also support the analysis of the
chemical modification of rPI-3 by Kageyama: when lysine residues were modified, the
inhibition was decreased. The more acidic surfaces of pepsin and cathepsin E might form
electrostatic associations with the basic residues of rPI-3, and cathepsin D might form
fewer electrostatic interactions with PI-3. These results suggest possible mutations to be
developed in order to discern interactive residues, which will be discussed in Chapter 5.
These results also call for a better understanding of the structural characteristics of PI-3
and of its interactions with aspartic proteinases.


CHAPTER  4 

STRUCTURE  PREDICTIONS  AND  CONFORMATIONAL  STABILITY  OF  PI-3 

Introduction 

The  primary  structure  of  a protein  determines  the  secondary  structural  motifs  and 
the  folded,  tertiary  structure.  Without  experimental  knowledge  of  the  structure  of  a 
protein,  the  conceptually  infinite  number  of  possible  conformations  makes  predicting  the 
precise  protein  structure  an  impossible  challenge.  With  an  ever-growing  number  of 
experimentally  solved  protein  structures,  models  have  been  designed  and  are  being 
improved  to  be  able  to  predict  structural  information  about  a protein  based  on  little  more 
than  the  amino  acid  sequence.  Prediction  methods  are  typically  50-65%  accurate  for 
secondary  structure  motifs.  Predictions  regarding  tertiary  structures  are  still  not  possible 
unless  the  structure  of  a homologous  protein  has  already  been  solved.  Without  a 
structure  and  only  moderate  homology  to  other  protein  sequences,  predictions  about  the 
substructures  of  PI-3  have  been  made  based  on  the  amino  acid  sequence  of  PI-3,  and 
attempts  to  define  the  tertiary  structure  have  been  made. 

Proteinaceous  inhibitors  of  proteinases  have  been  shown  to  form  tightly  folded 
structures  with  strong  intramolecular  forces  that  stabilize  the  conformation.  The  folding 
and  unfolding  of  such  inhibitors  have  been  used  to  show  these  are  among  the  most 
thermodynamically  stable  of  proteins  studied.  Additionally  from  the  tertiary  structures  of 
solved  inhibitors,  these  proteins  are  known  to  form  tight,  globular  conformations,  with 
contributions  from  multiple  disulfide  bonds. 
Neither the conformational stability nor the tertiary structure of PI-3 has
been determined. PI-3 is known to be susceptible to digestion by the proteolytic enzymes
trypsin and chymotrypsin (Abu-Erreish and Peanasky, 1974b). Disulfide bond studies by
Martzen et al. (1990) addressed the possible roles of cysteine residues in the global
structure. Circular dichroism analysis of native PI-3 also indicated that the
protein contains a mix of secondary structures.

The  focus  of  this  chapter  is  on  structural  features  of  rPI-3  that  can  be  predicted  or 
determined  experimentally.  Secondary  structural  predictions  will  be  shown  to 
complement  the  circular  dichroism  data  for  rPI-3.  The  expression  of  the  protein  for 
crystallographic  analyses  will  be  described.  The  conformational  stability  of  the  tertiary 
structure  of  rPI-3  will  be  explored  by  urea-induced  protein  unfolding  experiments,  and 
the  contribution  due  to  the  disulfide  bonds  will  be  explored. 

Results 

Predictions  of  Structures 

The  sequence  of  rPI-3  was  analyzed  using  a variety  of  methods  designed  to  predict 
protein  structural  motifs.  Information  about  surface  residues  could  aid  in  the 
determination  of  PI-3  residues  that  interact  with  target  proteinases.  The  accessibility  of 
amino  acids  to  solvent  and  protein  secondary  structures  are  physical  properties  that 
contribute  to  the  global  conformation  of  a protein.  The  lack  of  sequence  homology  to 
any  protein  that  has  a solved  tertiary  structure  prevents  three-dimensional  models  from 
being developed for PI-3. While the tertiary structure cannot be predicted for PI-3,
predictions of solvent-exposed residues and secondary structures might provide insights
into the topography of the protein.


The PHDacc method predicts amino acid solvent accessible surface
areas from the protein sequence (Rost and Sander, 1994). This model presupposes that
most of the hydrophobic residues will be internal and most charged residues will be
solvent exposed, and it is based on analyses of the percent solvent exposure for all
residues from 238 globular proteins. The most significant predictions suggested that the
majority of β-strand regions were buried and much of the α-helices were solvent
exposed. From the model, the protein was predicted to contain nearly equal numbers of
buried and exposed residues (71 to 78) and to be globular in overall shape. The
protein was also predicted to have a pI of 5.1 based on the amino acid content using the
ExPASy server Compute pI analytical program (Wilkins et al., 1998). PI-3 was analyzed
by isoelectric focusing on a Fast-Gel apparatus with a pH 3-9 IEF gel, and the isoelectric
point was determined to be 5.5, slightly more basic than the prediction.
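
A sequence-based pI estimate of this kind is found by locating the pH at which the summed Henderson-Hasselbalch charges of the ionizable groups equal zero. The sketch below illustrates the idea only: the pKa values are approximate textbook numbers chosen for illustration, not the parameter set used by the ExPASy Compute pI tool, every cysteine is treated as a free thiol (cysteines in disulfide bonds would not ionize), and the function names are hypothetical. Its output should therefore not be expected to reproduce the 5.1 value quoted above.

```python
# Approximate, illustrative pKa values (assumed; not the ExPASy parameter set).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH from Henderson-Hasselbalch terms."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))  # deprotonated C-terminus
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Bisect for the pH of zero net charge (charge falls monotonically with pH)."""
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0.0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0
```

Bisection is sufficient here because the net charge decreases monotonically as the pH rises, so there is exactly one crossing point between pH 0 and pH 14.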

Secondary  structure  prediction  methods  require  large  databases  of  known  protein 
structures  and  classically  base  the  predictions  on  a statistical  survey  of  the  proteins  in  the 
database. Prediction methods are moderately successful at predicting secondary
structural motifs from the amino acid sequence of one protein; improvements in
accuracy can come from multiple sequence alignments of similar proteins and the use of
neural network computing to apply forward and reverse weighting to the predictions.

Multiple  secondary  structure  prediction  methods  were  applied  to  the  sequence  of 
PI-3. The Garnier-Osguthorpe-Robson model (GOR4) uses a simple statistical analysis
of protein structures in a database to predict the secondary structures based entirely on the
protein input sequence (Garnier et al., 1996). The methods SOPMA, Predator, DSC, and
PHD use different neural networks to compute the predictions of secondary structure.
organized on a 'winner-take-all' scheme. The multiple steps of the neural networks
commonly  analyze  each  amino  acid  in  relation  to  nearby  amino  acids  for  the  probability 
of  forming  helical,  strand  or  coil  structure,  with  the  highest  score  listed  as  the  predicted 
structure  (Geourjon  and  Deleage,  1995,  Frishman  and  Argos,  1997,  King  and  Sternberg, 
1996,  Rost  and  Sander,  1993).  The  determination  of  the  secondary  structural  content  is 
based  on  testing  the  comparison  of  each  amino  acid  to  a database  set  of  proteins  having 
known structures. SOPMA is the only method that includes predictions for likely β-turns
in the protein, symbolized in Figure 4-1 as 'T'. When the probability score for a residue
in the PHD method is below 50%, the secondary structure for that amino acid is not
determined, symbolized in Figure 4-1 as a period (.). These four methods can improve
the  accuracy  when  including  multiple  alignments  of  proteins  with  high  sequence  identity; 
however,  the  proteins  and  putative  gene  products  of  PI-3  were  not  of  sufficient  identity  to 
improve  the  accuracy  of  the  predictions.  Such  algorithms  have  been  tested  on  protein 
sequences  from  samples  with  solved  structures,  and  these  commonly  show  50-65% 
accuracy,  from  the  simple  GOR4  method  to  the  neural  networked  methods. 

Figure 4-1 shows the results of the sequence-based predictions of secondary
structure of PI-3. These five prediction methods generally agree that a majority of PI-3
residues form random coils or loops at similar sequence regions. Regions of predicted
helical secondary structure are similar between the different methods, showing 2-3
probable helices. Four of the five models predict α-helices at residues 37-44, 91-95, and
102-118. Likewise, regions of extended β-strand structures are similarly predicted
among the different methods, showing 3-5 probable strands. Segments of sequence
having the greatest consensus among the five methods for β-strand prediction are at residues
12-15, 20-22, and 73-77. From these secondary structure prediction methods, Table 4-1
shows the percent of rPI-3 predicted to form the different secondary structures. These
five methods predict that β-strands are less common in PI-3 than α-helices. The protein
is predicted to be a mixture of α/β structures based on a consensus of these results.
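The consensus described above can be sketched as a per-residue vote across the methods. This is only an illustration: the five strings are hypothetical 12-residue outputs (not PI-3 predictions), the function name `consensus` is invented here, and the threshold of four agreeing methods is taken from the "four of the five models" wording in the text.

```python
from collections import Counter

def consensus(predictions, min_agree=4):
    """Per-residue majority vote over equal-length prediction strings.

    H = helix, E = strand, C = coil. A '.' marks positions where fewer
    than min_agree methods agree on a single state.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        result.append(state if votes >= min_agree else ".")
    return "".join(result)

# Hypothetical outputs from five methods (illustrative only, not PI-3 data).
toy = [
    "CCHHHHHCCEEE",
    "CCHHHHCCCEEE",
    "CHHHHHHCCEEC",
    "CHHHHHHCCCEE",
    "CCCHHHHCEEEE",
]
print(consensus(toy))
```

Positions where only three methods agree are left undetermined, which mirrors how the PHD output uses a period for low-confidence residues.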


Table 4-1. Secondary Structure Predictions of rPI-3

Method      % Helix   % Strand   % Random   % Turn   % Undetermined
GOR4        25.5      18.1       56.4       -        -
SOPMA       30.2      21.5       42.9       5.4      -
Predator    28.9      14.8       56.3       -        -
DSC         21.5      11.4       67.1       -        -
PHD         18.1      16.1       37.6       -        28.2
K2D         25        22         53         -        -

Circular Dichroism Spectropolarimetry

Circular  dichroism  can  be  used  to  experimentally  analyze  the  protein  for  secondary 
structures.  Circular  dichroism  measures  the  degree  that  polarized  light  is  displaced  to  the 
right  or  left  from  plane  polarized  incident  light  passing  through  a sample.  The  shift  of  the 
light  due  to  the  protein  is  dependent  on  repetitive  motifs  in  the  protein,  such  as  helices 
and  strands,  and  degree  of  displacement  is  reported  as  the  ellipticity. 

The secondary structural predictions in Table 4-1 generally agree with the circular
dichroism analysis performed on the native protein (Martzen et al., 1991). That study
reported the data in terms of ellipticity and not in mean residue weight ellipticity, which
accounts for the protein concentration and is more reliable for comparing proteins of
different compositions and concentrations than a simple ellipticity measure.
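As a sketch of why the mean residue weight form is concentration-independent, the conversion from an observed ellipticity can be written as below. The formula is the standard one for data in millidegrees; the molar mass, concentration, and path length in the example are hypothetical placeholders, not values from this study.

```python
def mean_residue_ellipticity(theta_mdeg, molar_mass_da, n_residues,
                             conc_mg_per_ml, path_cm):
    """Convert observed ellipticity (mdeg) to mean residue weight
    ellipticity in deg*cm^2/dmol.

    MRW is the molar mass divided by the number of peptide bonds.
    """
    mrw = molar_mass_da / (n_residues - 1)
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)

# Hypothetical measurement: -10 mdeg for a 149-residue protein of
# 14,800 Da at 0.1 mg/mL in a 0.1 cm cell (placeholder numbers).
print(mean_residue_ellipticity(-10.0, 14800.0, 149, 0.1, 0.1))
```

Doubling the concentration doubles the observed ellipticity but leaves the mean residue weight value unchanged, which is what makes spectra from different preparations comparable.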

The circular dichroism spectrum of rPI-3 was determined using a JASCO J-500c
spectropolarimeter. From the CD spectrum of rPI-3 (Figure 4-2), a positive ellipticity
maximum was detected near 195 nm, and a negative ellipticity minimum was detected at
208 nm. The mean residue weight ellipticity approaches zero going from 210 to 240 nm
and forms two steps with inflections near 215 and 224 nm. Model β-strand proteins
exhibit a positive ellipticity maximum near 196 nm and a single negative minimum near
212 nm, while spectra of model helical proteins exhibit a positive ellipticity maximum at
195 nm and two negative ellipticity minima at 208 nm and 222 nm. For rPI-3, the 195
nm peak does not differentiate between α and β character, but rather it is indicative of the
presence of either α or β character. The minimum at 208 nm suggests the presence of
helical regions, while the first plateau after 208 nm, with an inflection at 215 nm, is due
to β-strand regions. The second plateau, with an inflection at 224 nm, also indicates the
presence of helix content. Taken together, these data suggest that the protein is
composed of a mixture of α and β secondary structural elements.

The wavelengths at which the ellipticity maximum and minimum were detected are
similar to those values reported for the native protein. An ellipticity maximum was
reported at 193 nm for the native protein, within 2 nm of the reported value for rPI-3. An
ellipticity minimum was detectable at 208 nm, in agreement with the value for rPI-3. The
ellipticity in the region from 210-240 nm steadily approaches zero. The native protein
CD spectrum suggested that native PI-3 is composed of both α-helical and β-strand
character, which is also predicted from the CD spectrum of the recombinant protein.


Figure 4-2. Circular Dichroism Spectrum of rPI-3 (y-axis: mean residue weight ellipticity, deg·cm²/dmol)

From the molar ellipticity data collected between 200-240 nm, the neural network
program K2D (Andrade et al., 1993) was used to predict the secondary structures of
rPI-3. The algorithm predicted rPI-3 to have a mixed α/β character of 25%/22%. The
predictions for the approximate percent helix/strand are similar to those from the
sequence-based predictions. The circular dichroism data appeared to confirm the
sequence-based predictions that proposed the mixed secondary structures for PI-3, and
the similarity between these methods implies that the approximate regions of predicted
secondary structure motifs are likely to form into those secondary structures.
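The comparison between the sequence-based predictions and the K2D estimate can be made explicit with the values from Table 4-1. The snippet below is a small sketch that only restates those table entries and computes their ranges and means; the variable names are illustrative.

```python
# (% helix, % strand) for the five sequence-based methods in Table 4-1.
sequence_based = {
    "GOR4": (25.5, 18.1),
    "SOPMA": (30.2, 21.5),
    "Predator": (28.9, 14.8),
    "DSC": (21.5, 11.4),
    "PHD": (18.1, 16.1),
}
k2d_helix, k2d_strand = 25.0, 22.0  # CD-based K2D estimate from Table 4-1

helix = [h for h, _ in sequence_based.values()]
strand = [s for _, s in sequence_based.values()]

print("helix range:", min(helix), "-", max(helix),
      "mean:", round(sum(helix) / len(helix), 1), "K2D:", k2d_helix)
print("strand range:", min(strand), "-", max(strand),
      "mean:", round(sum(strand) / len(strand), 1), "K2D:", k2d_strand)
print("helix > strand in every method:",
      all(h > s for h, s in sequence_based.values()))
```

The K2D helix value falls inside the range of the sequence-based predictions, while the K2D strand value sits just above the highest sequence-based strand estimate.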

The  identity  of  multiple  aligned  sequences  can  improve  the  accuracy  of  these 
prediction  models.  The  sequence  identity  of  PI-3  is  less  than  25%  compared  to  the 
similar  nematode  proteins  and  gene  products  (Figure  1-2).  The  low  identity  precludes  the 
use  of  multiple  alignments  for  improved  accuracy  of  secondary  structure  prediction, 
requiring  as  much  as  40%  or  more  sequence  identity  (Rost  and  Sander,  1993).  Regions 
predicted  by  secondary  structure  methods  may  still  offer  some  insights  to  the  protein 
topography. When the regions of predicted helical content are drawn as helical
wheel projections (Figure 4-3), the predictions show a predominance of
hydrophobic residues on one face of the wheel and a predominance of hydrogen bonding
residues on another face. Another noteworthy prediction came from the putative β-strand
regions, which were predominantly hydrophobic residues (70%) with the inclusion of 2-3
cysteine residues in predicted β-strands. These two comparisons suggest that the
majority of the β-strands will form internal substructures while the helices are more
likely to be surface substructures with external hydrophilic residue side chains and
internal hydrophobic side chains.

Figure 4-3. Alpha-Helical Wheel Projections of Three Predicted Alpha-Helix Segments

Alpha-helical wheel projections for the regions predicted to have an 80% consensus for
α-helix structure from five prediction methods are shown above. The segments are (1)
residues 37-44 (VGKEFAAR), (2) residues 91-95 (KYVRD), and (3) residues 102-118
(AEVQTFREKIAAFEEQQ). Residues containing side chains that can form hydrogen
bonds are shown in bold.
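A helical wheel places successive residues 100 degrees apart around the helix axis, which is what allows a hydrophobic face to be recognized. The sketch below computes those angles for the three segments listed in the caption. The hydrophobic residue set and the clustering measure are assumptions made for illustration and are not taken from the original analysis.

```python
import math

# Assumed hydrophobic classification (illustrative, not from the source).
HYDROPHOBIC = set("AVILMFWY")

def wheel_angles(segment, step_deg=100):
    """Return (residue, angle) pairs for an ideal alpha-helix,
    advancing 100 degrees per residue."""
    return [(aa, (i * step_deg) % 360) for i, aa in enumerate(segment)]

def hydrophobic_clustering(segment):
    """Length (0 to 1) of the mean unit vector of the hydrophobic residues.
    Values near 1 indicate that they point toward one face of the wheel."""
    vectors = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
               for aa, a in wheel_angles(segment) if aa in HYDROPHOBIC]
    if not vectors:
        return 0.0
    mean_x = sum(x for x, _ in vectors) / len(vectors)
    mean_y = sum(y for _, y in vectors) / len(vectors)
    return math.hypot(mean_x, mean_y)

segments = [("37-44", "VGKEFAAR"), ("91-95", "KYVRD"),
            ("102-118", "AEVQTFREKIAAFEEQQ")]
for name, seg in segments:
    print(name, wheel_angles(seg), round(hydrophobic_clustering(seg), 2))
```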
Preparation for Crystallographic Analyses

Multiple preparations of rPI-3 totaling more than 35 mg of protein had been
purified using the scheme described in Chapter 2 and had been submitted for
crystallographic experiments. A collaborating team at the University of Alberta has used
samples to develop conditions to precipitate rPI-3 and a complex of rPI-3 and pepsin.
They were able to obtain crystals of a PI-3:pepsin complex with a space group of C222₁
(Petersen et al., 1998); however, the x-ray diffraction data had been poorly resolved due to the
absence of a heavy atom crystal derivative to aid with the phasing of the diffraction data.
PI-3 was expressed with the incorporation of selenomethionine amino acid residues, in
which selenium replacement of the methionine sulfur atom can be used as an internal
heavy atom for improvement of the phasing of the diffraction data (Hendrickson, 1991).
The protein expression required a cell line deficient in the metabolism of methionine, and
the met auxotroph E. coli cell line B834[DE3] was chosen for protein expression to
obtain rPI-3 containing only selenomethionine. The expression growth medium was
modified by the absence of hydrolyzed casein (casamino acids) and the addition of 100
mg/L selenomethionine and 0.4% glucose, twice the typical concentration. The cell
growth was, as expected, greatly retarded by this procedure, and only 1.5 grams of cells
per liter of culture were recovered, compared with the 3-4 grams per liter obtained in media
containing casamino acids. The same purification scheme was performed as before but
altered to include dithiothreitol in all buffers to prevent the oxidation of selenomethionine
residues within the protein. The yields at all stages of the process were less than what
had been achieved with the original procedure, and only 1.5 mg of protein were obtained
per liter of starting culture. This material was also submitted to the collaborating team at
the University of Alberta. To date, this has not yielded enough protein for the
crystallographic team to grow crystals suitable for continuing the diffraction studies.

The results of the PI-3:pepsin complex crystallization and diffraction will be described in
Chapter 5. Without a set of crystals for the uncomplexed PI-3, a detailed protein
structure remains elusive.

Stability  Study  of  rPI-3 

The stability of the tertiary structure of rPI-3 was assessed through denaturant-induced
protein unfolding. The folding state of the protein was determined by
fluorescence spectroscopy, exciting the one tryptophan residue (Trp27) at 295
nm and measuring the resultant emission spectrum. The tryptophan fluorescence
emission spectrum had a maximum at 360 nm when the protein was fully folded, but in
the denatured state, when the tryptophan becomes exposed to solvent, the maximum
shifted to 366 nm. The wavelength of 350 nm, at which the greatest fluorescence
difference between the native and denatured states of the protein was measured, was used
to collect the fluorescence emission data. The protein was shown to be capable of
unfolding upon the addition of 7.2 M urea and refolding when diluted ten-fold into
sodium phosphate buffer (Figure 4-4). To confirm that rPI-3 regained activity following
the refolding, pepsin inhibition was compared at equal concentrations of rPI-3; the protein
inhibited pepsin to the same degree whether it had been unfolded and refolded or never
unfolded. This behavior reflected a true
equilibrium between the unfolded and folded states of the protein.

A series of concentrations of urea from 0-9.3 M were prepared into which rPI-3 was
diluted. The degree of protein unfolding was determined for each of the protein samples
after one hour to establish and stabilize the equilibrium between the native and denatured
states. The data were normalized to reflect the fraction of unfolded protein (closed
circles). A best fit sigmoidal curve was determined for the fluorescence emission data
that indicated the protein was undergoing a cooperative unfolding transition due to the
presence of urea (Figure 4-5).
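The normalization to fraction unfolded can be sketched as follows. This assumes flat native and denatured baselines (a simplification; sloping baselines are often fitted in practice), and the intensity values in the example are hypothetical.

```python
def fraction_unfolded(signal, native_signal, denatured_signal):
    """Normalize a fluorescence reading to the fraction of unfolded protein,
    assuming constant native and denatured baseline signals."""
    return (signal - native_signal) / (denatured_signal - native_signal)

# Hypothetical intensities at 350 nm (arbitrary units, not measured data).
native, denatured = 100.0, 50.0
for reading in (100.0, 90.0, 75.0, 55.0, 50.0):
    print(reading, fraction_unfolded(reading, native, denatured))
```

Plotting these fractions against urea concentration gives the sigmoidal transition to which the best-fit curve is applied.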

A two-state transition was assumed for the protein unfolding, and data in the
transition reflected a mixed population of the proteins in the native and denatured states
in a dynamic equilibrium. The data in the transition of the curve were converted to terms
for the unfolding equilibrium constant, Kdn, and free energy of denaturation, ΔGdn, from equation 6,
shown in Chapter 2. The data for concentration of urea versus free energy were plotted
(Figure 4-6), and a best fit line through the data was extrapolated to the y-axis. This
extrapolated value is the term for the free energy of rPI-3 stability in water, ΔG(H2O), and
was determined to be 10.1 kcal/mol. The free energy of rPI-3 stability was used to
determine the concentration of urea at which half the proteins have unfolded, [urea1/2],
from equation 7. The resulting value for the midpoint of the transition is 6.46 M, shown in
Table 4-2.


Table 4-2. Thermodynamic Stability of rPI-3

                  ΔG(H2O) (kcal/mol)   [urea1/2] (M)
Oxidized rPI-3    10.1                 6.46
Reduced rPI-3     6.7                  5.22
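The relationships behind Table 4-2 can be sketched under the usual linear extrapolation model. Equations 6 and 7 are given in Chapter 2 and are not reproduced here, so the forms below (K = fu/(1 - fu), ΔG = -RT ln K, and [urea1/2] = ΔG(H2O)/m) are assumed standard forms rather than quotations. The temperature of 298.15 K is also an assumption.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def delta_g(fraction_unfolded, temp_k=298.15):
    """Free energy of denaturation for a two-state transition.
    Assumed form: K = fu / (1 - fu), dG = -R * T * ln(K)."""
    k_dn = fraction_unfolded / (1.0 - fraction_unfolded)
    return -R_KCAL * temp_k * math.log(k_dn)

def m_value(dg_water, midpoint):
    """Slope implied by dG = dG(H2O) - m * [urea], since dG = 0 at the midpoint."""
    return dg_water / midpoint

# Values from Table 4-2.
oxidized = {"dg_water": 10.1, "midpoint": 6.46}
reduced = {"dg_water": 6.7, "midpoint": 5.22}

m_ox = m_value(**oxidized)
m_red = m_value(**reduced)
ddg = oxidized["dg_water"] - reduced["dg_water"]

print("m (oxidized), kcal/mol/M:", round(m_ox, 2))
print("m (reduced),  kcal/mol/M:", round(m_red, 2))
print("difference in dG(H2O), kcal/mol:", round(ddg, 1))
```

At the midpoint the folded and unfolded populations are equal, so ΔG is zero there; this is why the midpoint follows directly from the intercept and slope of the fitted line.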

The experiment was repeated with the addition of β-mercaptoethanol to the urea to
measure the stabilizing effect of the disulfide bonds on the conformational stability of the
protein. The unfolding of PI-3 was observed to begin at a lower urea concentration in the
presence of the reducing agent (Figure 4-5). As before, the protein was analyzed for
function and was fully active after being unfolded in the presence of β-mercaptoethanol
and urea and refolded by dilution. The free energy of unfolding decreased for the protein
in β-mercaptoethanol compared to the value under oxidized conditions (Table 4-2). The
midpoint of denaturation was also lower for rPI-3 in reduced conditions compared to the
oxidized state. The difference in the free energy of stability for rPI-3 between the
oxidizing and reducing conditions suggests that the disulfide bonds contribute 3.4
kcal/mol of energy to stabilize the protein structure. While the disulfide bonds clearly
contribute to PI-3 stability, the reduced protein retains a free energy of stability of 6.7
kcal/mol, so the disulfide bonds may not be required for the protein to remain stably folded.
These analyses do not predict whether or not the disulfide bonds are required for protein
function.

Figure 4-5. Unfolding Curves for rPI-3 in Urea in Reduced and Oxidized Conditions

Tryptophan fluorescence emission spectra were shown to have the greatest difference
between the folded and unfolded states at 350 nm. Fluorescence intensity values at 350
nm for rPI-3 in different concentrations of urea were normalized to fraction unfolded and
are shown above. The two sets of data reflect protein in the presence (open circles) and
absence (closed circles) of 35 mM β-mercaptoethanol.

Figure 4-6. Free Energy of Stability for rPI-3 under Oxidized and Reduced Conditions (x-axis: Urea Concentration (M))

Closed circles represent free energy data obtained from protein unfolding performed in
oxidized conditions, and the open circles represent free energy data obtained from protein
unfolding in reduced conditions. Lines were determined by least squares best fit of the
data, and extrapolated to the y-axis to represent the ΔG(H2O).

Proteolytic  Digestion  of  rPI-3 

The  sensitivity  of  rPI-3  to  the  proteinases  trypsin  and  endopeptidase  Lys-C  was 
analyzed.  Within  two  hours  of  mixing  rPI-3  with  the  enzymes  maintained  at  room 
temperature  and  pH  7.8,  digestion  of  rPI-3  had  begun,  and  was  detectable  by  SDS 
polyacrylamide  gel  electrophoresis.  The  complete  digestion  of  the  protein  could  be 
observed  in  this  manner  after  22  hours  of  incubation.  These  results  agree  with  previous 
analyses  showing  that  PI-3  is  susceptible  to  proteolytic  digestion.  The  ease  of  digestion 
suggests  that  the  protein  is  conformationally  flexible.  Certainly  segments  of  PI-3  that 
include  lysine  residues  are  solvent  exposed  and  can  be  hydrolyzed. 

Mass  Spectrometry  to  Define  Disulfide  Bond  Pairs 

To confirm that the disulfide bonding pattern of the cysteine residues in rPI-3
reflected the pattern in the native protein, the proteolysis of rPI-3 was performed. As had
been shown above, rPI-3 is susceptible to proteolytic digestion by trypsin and
endopeptidase Lys-C. Trypsin was used to hydrolyze rPI-3 because of its specificity for
hydrolysis of peptide bonds on the carboxy-terminal side of lysine and arginine residues, and
endopeptidase Lys-C was used for the specific hydrolysis of peptide bonds following
lysine residues. At least one lysine residue lies between each pair of consecutive cysteine
residues among the six in rPI-3. Following a 32-hour digestion by trypsin, all of the full-length rPI-3
had been hydrolyzed to smaller fragments. The reaction was quenched in an acidic
buffer, and dilutions of the suspension were made and mixed with the ionization matrix
α-cyano-4-hydroxycinnamic acid for MALDI-TOF mass spectrometry. This protocol
was performed in duplicate, with one trypsin digest performed in the presence of
β-mercaptoethanol and the other without the reducing agent. The two conditions were
prepared in order to compare differences in the mass spectra between the reduced rPI-3,
which should show the complete digestion profile of rPI-3, and the oxidized protein,
which should show the masses for peaks containing disulfide-bonded fragments.

Table 4-3 shows the predicted molecular masses for the fragment peptides of the
fully hydrolyzed and reduced protein. The peptide bond after Lys61, between the third
and fourth cysteine residues (Cys59 and Cys66), was not predicted to be hydrolyzed due
to the proline residue immediately following Lys61. Table 4-4 shows the
predictions for the hydrolyzed fragments in the oxidized state and is composed of two
parts. The first is a list of the possible molecular masses for fragments containing a
disulfide bond from Cys146, the sixth cysteine, to each of the other cysteine-containing
fragments. Masses listed in Table 4-4 for fragments 3a and 3b are presented for the
possibility that fragment 3 might be hydrolyzed at the Lys61-Pro62 peptide bond. The
second is a list of all possible combinations of the first four cysteine residues to form
disulfide bonds if Cys79 and Cys146 form a disulfide bond, as had been determined for
the native protein (Martzen et al., 1990).

Table 4-3. Predicted rPI-3 Fragments from Trypsin Hydrolysisᵃ

#     Mass (Da)   Sequence                                   Position
1     3957        IAAFEEQQENQPPSSGMPHGAVPAGGLSPPPPPSFCTVQ    111-149
2ᵇ    2589        DNQVFVANLPWTMLEGDDIQVGK                    17-39
3     2201        HDMAPTCTKPPPFCGPQDMK                       53-72
4     2038        MTQFLFSMSTGPFICTVK                         1-16
5     1415        MFNFVGCSVLGNK                              73-85
6ᵇ    1102        DHAEVQTFR                                  100-108
7     907         VEDCTNVK                                   45-52
8ᵇ    763         LFIDQK                                     86-91
9ᵇ    593         EFAAR                                      40-44
10ᵇ   547         DLTAK                                      95-99
11ᵇ   437         YVR                                        92-94
12ᵇ   275         ER                                         109-110

ᵃFor endopeptidase Lys-C fragments, all are the same with the exception that 6 and 12
(1360), 9 and 7 (1482), and 11 and 10 (965) are combined due to the lack of hydrolysis
after arginine residues. ᵇFragments contain no cysteine residues and will not be affected by
the pairing of disulfide bonds.

Table 4-4. Predicted Disulfide Bonding Peptides for the Hydrolysis of
Oxidized rPI-3 for the Pairing of Fragment 1 and Predicted Peptides
for the Remaining Cysteine-Containing Fragmentsᵃ

Fragment Pair   Mass (Da)
1+3             6158
1+3aᵇ           4959
1+3bᵇ           5172
1+4             5995
1+5ᵃ            5372
1+7             4863

Fragment Pair   Mass (Da)
3a+3b           2201
4+7             2942
3a+4            3041
3b+7            2123
3a+7            1910
3b+4            3254
3+4+7           5144

ᵃFrom the mass spectrum, the mass corresponding to fragments 1+5 was detected;
therefore, predictions for the masses combining fragments 3, 4 and 7 were made. ᵇ3a and
3b refer to hypothetical fragments from the hydrolysis of fragment 3 at the lysine-proline
bond.
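The predicted fragments in Table 4-3 follow from two rules stated in the text: cleavage after lysine and arginine (lysine only for endopeptidase Lys-C), and no cleavage when the next residue is proline. The sketch below applies those rules and sums average residue masses plus one water. The residue mass table uses standard average values rounded to two decimals, the function names are illustrative, and the test sequence is the stretch of residues 86-108 assembled from the fragments listed in the table.

```python
# Standard average residue masses (Da), rounded to two decimals.
AVG_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02

def digest(sequence, cleave_after="KR"):
    """Split a sequence after each residue in cleave_after, except when
    the following residue is proline. Use cleave_after="K" for Lys-C."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence[:-1]):
        if residue in cleave_after and sequence[i + 1] != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments

def peptide_mass(peptide):
    """Average mass of a linear, reduced peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER

# Residues 86-108, assembled from fragments 8, 11, 10 and 6 of Table 4-3.
for fragment in digest("LFIDQKYVRDLTAKDHAEVQTFR"):
    print(fragment, round(peptide_mass(fragment), 1))

# Fragment 3 stays intact because Lys61 is followed by proline.
print(digest("HDMAPTCTKPPPFCGPQDMK"))

# Lys-C does not cut after arginine, so fragments 9 and 7 stay joined.
print(digest("EFAARVEDCTNVK", cleave_after="K"))
```

Joining two fragments removes one water, which is why the combined Lys-C masses in the table footnote are about 18 Da less than the sum of the separate tryptic fragments.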

The results of the mass spectra for the trypsin hydrolysis of reduced and oxidized
rPI-3 are shown in Figure 4-7. The left panel shows the spectrum for hydrolyzed,
reduced rPI-3, and the right panel shows the spectrum for hydrolyzed, oxidized rPI-3,
containing disulfide-bonded peptide fragments. From the reduced hydrolysis of rPI-3,
the following masses of fragments were detected: 3957, 2589, 2201, 1415, 1102, and
763. The four peptide fragments below the mass of 750 Daltons
were not detected in the mass spectrum. Two other peptide fragments, 2038 and 907, both of
which include one cysteine residue, were also not detected. In addition, no mass peaks were
detected above the
value of 4500 Daltons. As was predicted, the third fragment (2200) that included Cys59
and Cys66 was completely intact, showing that the lysine-proline peptide bond was not
hydrolyzed in the process.

The oxidized rPI-3 mass spectrum showed two peak masses greater than any
detected in the spectrum of the reduced fragments of rPI-3. Mass peaks corresponding to
fragments 2589, 2200, 1102, and 763 were again detected. The detection of the 2200-Da
fragment in both the reduced and oxidized states suggested that not only did the
lysine-proline bond remain intact in this fragment, but either the two cysteine residues might form a
disulfide bond pair or the cysteine disulfide bonds were reduced before detection by the
mass spectrometer. The masses at 4165 and 4385 Daltons were not predicted from any
combination of disulfide bonding pairs, and the origin cannot be determined for these
mass peaks. The mass peak at 5368 Da corresponds to a peptide including a disulfide
bond between Cys79 and Cys146, showing that the fifth and sixth cysteine residues form
a disulfide bond. The remaining peak at 5011 Da does not directly correspond to any of
the remaining possible combinations of pairs of fragments. The closest peptide
combination was for a mass of 5144 Daltons, which would reflect the combination of the
reduced peptide fragments 2200, 2038, and 907. The difference between the predicted
and measured masses of 133 Daltons is nearly the mass of one methionine residue
(131 Da), which could be removed during the protein expression by a bacterial
aminopeptidase. No peptide with a mass near 2038 Da was detected in the fully reduced
fraction; thus the presence of the amino-terminal methionine cannot be confirmed. This
fragment would include the remaining four cysteine residues that could form one of two
disulfide bond combinations: Cys13-Cys66 and Cys48-Cys59 or Cys13-Cys59 and
Cys48-Cys66. The sequence between the third and fourth cysteine residues is
C59-T-K-P-P-P-F-C66 and does not offer any better option for proteolytic hydrolysis to separate
these two cysteine residues. These data, while not complete, show that a disulfide bond
forms between Cys79 and Cys146. The data predict that the third and fourth cysteine
residues do not form a disulfide bond, leaving only two possible combinations of the first
four cysteine residues.
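The assignment of the oxidized-spectrum peaks can be sketched as a lookup against the predicted masses of Table 4-4, with an optional loss of the amino-terminal methionine for combinations that contain fragment 4. The 5 Da tolerance is an assumption chosen for illustration, the function name is invented here, and the 4863 Da entry is taken as fragment 1 plus fragment 7 (3957 + 907).

```python
# Predicted masses (Da) from Table 4-4.
PREDICTED = {
    "1+3": 6158, "1+3a": 4959, "1+3b": 5172, "1+4": 5995,
    "1+5": 5372, "1+7": 4863,
    "3a+3b": 2201, "4+7": 2942, "3a+4": 3041, "3b+7": 2123,
    "3a+7": 1910, "3b+4": 3254, "3+4+7": 5144,
}
MET_RESIDUE = 131  # Da, mass of one methionine residue as quoted in the text

def assign_peak(peak, tolerance=5):
    """Return labels of predicted combinations within tolerance of a peak.
    Combinations containing fragment 4 (the amino-terminal peptide) are
    also tried with the amino-terminal methionine removed."""
    hits = []
    for label, mass in PREDICTED.items():
        if abs(peak - mass) <= tolerance:
            hits.append(label)
        if "4" in label.split("+") and abs(peak - (mass - MET_RESIDUE)) <= tolerance:
            hits.append(label + " minus Met")
    return hits

for peak in (4165, 4385, 5011, 5368):
    print(peak, assign_peak(peak))
```

Under these assumptions the 5368 Da peak matches the 1+5 pair, the 5011 Da peak matches the 3+4+7 combination only after methionine loss, and the 4165 and 4385 Da peaks remain unassigned, in line with the interpretation given above.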

These mass spectral analyses were repeated for the reduced and oxidized rPI-3
hydrolysis by endopeptidase Lys-C (Figure 4-8). The predicted masses for the fragments
1-5 and 8 are identical to those for trypsin shown in Table 4-3, but only three other
fragments (1482, 1360, and 965 Da) are predicted because endopeptidase Lys-C does not
hydrolyze bonds following arginine residues. As was seen before, the lysine-proline
bond between the third and fourth cysteine residues is not hydrolyzed in the reduced
state. The peptide at 763 Da was not detected in any of these analyses. All other mass
fragments could be detected in the fully reduced form, including the mass of 1907
Daltons corresponding to the first 16 residues of rPI-3 without the amino-terminal
methionine. Though the peptide was not detected using the trypsin process, the
endopeptidase Lys-C hydrolysis clearly showed the mass peak, suggesting that the
amino-terminal methionine is hydrolyzed from PI-3. This result supports the conclusion
above that the trypsin hydrolysis of the oxidized protein generates a fragment that lacks
the amino-terminal methionine and contains the first four cysteine residues linked through two
disulfide bonds. For the endopeptidase Lys-C hydrolysis of rPI-3 under oxidized
conditions, the peak corresponding to a peptide formed by the disulfide bond between
Cys79 and Cys146 was detectable. A mass peak corresponding to the four-cysteine-containing
peptide was marginally visible, as was a peak corresponding to peptide
fragment 2589.

These results agree with the discovery of the disulfide pairs of the native PI-3. The
disulfide pair determination of native PI-3 was made by cyanogen bromide cleavage
followed by automated Edman degradation. The resulting disulfide bond pairs for native
PI-3 were Cys13-Cys59, Cys48-Cys66, and Cys79-Cys146. The most definitive result
for the recombinant protein was the determination of the Cys79-Cys146 bond. Of the
combinations for the other four cysteine residues into disulfide bonding pairs, two remain
probable due to the difficulty in separating the Cys59 and Cys66 from the same peptide.
These results imply that the protein has two possible domains based on the two disulfide
bonds in the amino-terminal half of the protein and the one disulfide bond in the
carboxy-terminal half of the protein.

Discussion 

Predictions  of  a protein  based  on  limited  information  must  be  considered 
cautiously.  With  many  types  of  prediction  methods  using  different  databases  and 
algorithms,  the  field  of  protein  predictions  continues  to  expand.  Many  of  these  methods 
do  follow  some  of  the  same  assumptions.  These  are  all  based  on  some  limited  set  of 
proteins,  the  structures  of  which  are  known.  The  databases  of  proteins  are  used  to 
generate  statistical  expectations  for  a series  of  amino  acid  residues,  and  how  those 
residues influence the others nearby to form a secondary structure motif. As statistical
tools, these methods are not completely reliable. In fact, these methods are often no better
than 75-80% accurate at predicting the correct secondary structural motifs for proteins that
are within the database (Cuff and Barton, 1999). Certainly this would imply that
predicting  structural  characteristics  of  novel  proteins,  from  which  the  models  were  not 
developed,  would  be  less  accurate.  In  fact,  predictions  of  proteins  that  are  not  included  in 
the  databases  and  that  have  solved  structures  are  50-65%  accurate  for  identifying  amino 
acids  with  the  type  of  secondary  structures. 

The analyses of PI-3 by the five protein prediction models overlap at many regions of
the protein. Some of this may have to do with similarities between the methods used to
define these parameters, and this might also be part of the reason for the generally
consistent accuracy of these methods between 50-65%. The different methods used
returned estimates of helical content between 18-30%, and strand content of 11-21%. In
every prediction method using amino acid sequence, the helical content was greater than
strand content. The prediction of structural content
based on the circular dichroism spectrum of rPI-3 was similar, indicating more helix
content (27%) than strand content (21%) in the protein. The similarity in the percentage
of amino acids forming specific secondary structures suggests that the sequence-based
models are likely to be fairly accurate for the prediction of secondary structures of PI-3.

The  tertiary  structure  of  rPI-3  is  stable.  The  unfolding  of  rPI-3  required  a high 
concentration of urea ([urea]1/2 = 6.46 M), and the free energy term (10.1 kcal/mol) was
consistent with other stable proteins. β-Mercaptoethanol was used in the unfolding
experiments to show that cysteine residues are important for the thermodynamic stability
but are not entirely responsible for the structural stability. If other small proteinase
inhibitors are reasonable models, the protein may be primarily composed of hydrogen
bonding groups exposed to solvent and hydrophobic groups within the core, which would
account for such a strongly stable protein structure.

Comparisons  to  proteins  in  the  literature  suggest  that  PI-3  is  tightly  folded  and 
thermodynamically  stable.  Bovine  pancreatic  trypsin  inhibitor  has  been  thoroughly 
examined  as  an  inhibitor  and  as  a model  for  protein  structure  and  conformational 
stability.  The  importance  of  the  three  disulfide  bonds  has  been  measured  for  BPTI  with 
chemical  blocking  and  mutations  to  the  different  disulfide-bonding  residues.  The 
stability  of  BPTI  has  been  determined  by  fluorescence  unfolding  to  be  14.3  kcal/mol 
(Schwartz,  et  al.,  1987).  One  disulfide  pair  has  been  implicated  as  the  critical,  slow  step 
in  the  folding  mechanism.  When  that  pair  of  cysteine  residues  is  modified,  the  stability 
decreases  to  approximately  6.0  kcal/mol,  depending  on  the  modifications  to  the  residues. 
The active, amino-terminal domain of tissue inhibitor of metalloproteinases-1 has also
been examined for structural stability by denaturant-induced unfolding and shown to
have a stability of 11.8 kcal/mol (Williamson, et al., 1994).

Two  other  examples  of  proteins  that  have  been  studied  for  conformational  folding 
properties  are  lysozyme  and  ribonuclease,  which  are  globular  proteins.  These  have  been 
analyzed by urea-induced unfolding experiments. Lysozyme was determined to have a
[urea]1/2 = 5.21 M and a free energy of stability of 5.8 kcal/mol (Chen and Schellman,
1989). Ribonuclease T1 had a higher [urea]1/2 of 6.96 M, but the free energy of stability for
ribonuclease T1 was 7.7 kcal/mol (Pace, 1975). The free energy of stabilization is lower for
these two proteins than for rPI-3, even though the midpoint of urea denaturation of
ribonuclease T1 is higher than that for rPI-3. These results show that the midpoint of
urea denaturation, [urea]1/2, is not correlated directly with the free energy of protein
stability.  To  compare  protein  stabilities,  the  free  energy  term  can  be  acquired  from  the 
different  methods  of  measuring  protein  conformational  stability.  Pace  has  reviewed 
studies  of  protein  conformational  stability  and  reported  that  most  proteins  studied  have  a 
free  energy  stabilization  of  5-14  kcal/mol  (Pace  et  al.,  1990). 
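
The distinction between the denaturation midpoint and the free energy of stability can be illustrated with the linear extrapolation model, in which the free energy of unfolding falls linearly with denaturant concentration and equals zero at the midpoint. The sketch below assumes that model holds for all three proteins and that the midpoints quoted above are in molar units; both are assumptions made for illustration, and the function and variable names are hypothetical.

```python
# Implied m-values under the linear extrapolation model (assumed here):
#   dG([urea]) = dG_water - m * [urea], with dG = 0 at the midpoint,
# so m = dG_water / [urea]1/2.

def implied_m_value(dg_water_kcal_per_mol: float, urea_midpoint_molar: float) -> float:
    """Return m in kcal/(mol*M), assuming dG falls linearly with [urea]."""
    if urea_midpoint_molar <= 0:
        raise ValueError("midpoint must be positive")
    return dg_water_kcal_per_mol / urea_midpoint_molar

# (free energy of stability in kcal/mol, urea midpoint in M) as quoted in the text
proteins = {
    "rPI-3": (10.1, 6.46),
    "lysozyme": (5.8, 5.21),
    "ribonuclease T1": (7.7, 6.96),
}

m_values = {name: implied_m_value(dg, mid) for name, (dg, mid) in proteins.items()}

for name, m in m_values.items():
    print(f"{name}: m = {m:.2f} kcal/(mol*M)")
```

Under these assumptions the implied slope for rPI-3 is steeper than that for ribonuclease T1, which is one way a lower midpoint can coincide with a higher free energy of stability.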

The  free  energy  of  conformational  stability  (10.1  kcal/mol)  for  rPI-3  is  within  the 
range  for  globular  proteins,  though  the  value  is  a bit  lower  than  that  for  other  proteinase 
inhibitors. The free energy of conformational stability for rPI-3 is 4.2 kcal/mol lower
than that for BPTI, which might account for the greater susceptibility of PI-3 to
proteolytic digestion, whereas BPTI is resistant to most proteolytic enzymes. The
structure of BPTI is composed of a hydrophobic interior and a surface that is almost
entirely composed of hydrogen bonding groups. The lower energy stability for rPI-3
might be the physical result of more flexible loops or a greater prevalence of hydrophobic
residues on the surface and a slightly less compact hydrophobic interior.

The models of other inhibitors also suggest that hydrogen bonding may be more
important to inhibitor interactions with enzymes than to the association of monomers
within a functional oligomer. Again this suggests that the protein will require hydrogen
bonding groups on
the  surface  for  interacting  with  target  proteinases.  The  helical  wheel  projection  showed 
that  the  predicted  helices  could  form  amphipathic  helices  on  the  surface  of  the  protein. 
The expectation is that the hydrophobic residues on one face of the α-helices will be
buried within the hydrophobic core of the protein, composed of the β-strands and loop
regions.  This  hydrophobic  interior  will  contribute  to  the  conformational  energy  of  the 
protein.  The  polar  residues  will  form  another  face  of  the  helices  and  be  exposed  to 
solvent  to  contribute  to  the  protein  stability  through  hydrogen  bonds  to  water  molecules. 
The  three  disulfide  bonds  contribute  the  remaining  required  energy  for  the 
conformational  stability  of  rPI-3. 

The  predicted  polar  surface  residues  may  also  contribute  to  the  binding  affinity  of 
PI-3 to target proteinases. Lysine residues are predicted to be present on the surface from
the ease of trypsin and endopeptidase Lys-C hydrolysis of rPI-3. Consistent with the
prediction that hydrogen bonding is critical for enzyme inhibition, four lysine residues
were shown by Kageyama to be surface exposed and are candidates as reactive site contact
residues with target proteinases. The interaction of rPI-3 with proteinases will be
explored  in  chapter  5. 


CHAPTER  5 

THE  SPECIFIC  ASSOCIATION  OF  PI-3  WITH  ASPARTIC  PROTEINASES 

Introduction 

In the absence of a solved structure for the heteroduplex of pepsin and rPI-3, an
exploration  to  discover  binding  regions  of  rPI-3  and  the  aspartic  proteinases  was 
performed.  Some  basic  mechanistic  questions  were  investigated  using  experiments 
designed  to  measure  physical  properties  of  the  complex  of  pepsin  and  rPI-3.  The 
simplest mechanism of proteinase inhibition by PI-3 can be described by the formation of
an enzyme-inhibitor complex (EI) from the enzyme (E) and inhibitor (I) in the
equilibrium equation E + I ⇌ EI. The affinity of the inhibitor for the enzyme in this
model is expressed by the dissociation of the complex and is defined as
Ki = ([pepsin] × [PI-3]) / [PI-3:pepsin]. The two variables in this equation, pepsin and
PI-3, were altered
to  observe  changes  in  the  association  of  the  proteins.  Different  aspartic  proteinases  were 
measured  for  affinity  to  PI-3,  and  mutated  forms  of  rPI-3  were  expressed  to  probe  critical 
residues  for  the  specific  affinities  to  the  different  proteinases. 
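
The equilibrium model above can be sketched numerically. The example below is a minimal illustration, not the analysis used in this study: it solves the one-site equilibrium for the concentration of complex given total enzyme and inhibitor, then recovers Ki from the free and bound concentrations. The total concentrations are hypothetical; the Ki of 1.3 nM is the wild-type value reported later in this chapter.

```python
import math

def dissociation_constant(free_enzyme: float, free_inhibitor: float, complex_conc: float) -> float:
    """Ki = [E][I] / [EI]; all concentrations must share the same units."""
    if complex_conc <= 0:
        raise ValueError("complex concentration must be positive")
    return free_enzyme * free_inhibitor / complex_conc

def complex_at_equilibrium(total_enzyme: float, total_inhibitor: float, ki: float) -> float:
    """[EI] for E + I <=> EI, from (Et - EI)(It - EI) = Ki * EI.

    The smaller root of the quadratic is the physically meaningful one.
    """
    b = total_enzyme + total_inhibitor + ki
    return (b - math.sqrt(b * b - 4.0 * total_enzyme * total_inhibitor)) / 2.0

# Hypothetical totals (nM) with the wild-type Ki reported for rPI-3 and pepsin
e_total, i_total, ki = 10.0, 10.0, 1.3
ei = complex_at_equilibrium(e_total, i_total, ki)
ki_back = dissociation_constant(e_total - ei, i_total - ei, ei)

print(f"[EI] = {ei:.2f} nM, fraction of enzyme bound = {ei / e_total:.2f}")
print(f"Ki recovered from equilibrium concentrations = {ki_back:.2f} nM")
```

Solving the quadratic rather than assuming free inhibitor equals total inhibitor matters for a tight-binding inhibitor, because a large share of the inhibitor is bound when concentrations are near Ki.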

This  chapter  describes  experiments  designed  to  characterize  the  mechanism  of 
inhibition  of  PI-3  using  gel  filtration  and  gel  electrophoresis  techniques.  Contributions  to 
the  inhibition  model  from  limited  crystallographic  data  are  described.  Mutations  to  the 
inhibitor  were  used  to  examine  residues  that  might  contribute  to  the  protein  recognition 
and  strong  association  with  pepsin  and  provide  additional  support  for  a model  of  the 
inhibition mechanism.

Results

Complex  Formation  of  PI-3  and  Pepsin 

In  an  effort  to  investigate  the  mechanism  of  interaction  between  pepsin  and  PI-3, 
heteroduplex proteins were formed under conditions known to promote the association.
Mixing  the  two  proteins  together  at  approximately  equal  concentrations  at  pH  3.5 
generated  complexes  of  pepsin  and  PI-3.  The  formation  of  these  complexes  was 
analyzed  by  gel  filtration  chromatography. 

Separate  chromatographic  elutions  of  the  pepsin  (35  kDa),  rPI-3  (16.7  kDa),  and 
complex  were  performed;  between  each  elution  the  column  was  thoroughly  cleaned  to 
prevent contamination from one protein to the next. The superposition of the three
elution profiles (Figure 5-1) shows that the complex could be differentiated from the
two components based on the elution volume from a Superdex 75 gel filtration column.
The order of elution for these proteins was the pepsin:PI-3 complex, the PI-3
monomer, and the pepsin monomer. The complex, the largest of these macromolecules,
eluted from the size exclusion column earlier than did pepsin or PI-3, as predicted.
Surprisingly, pepsin eluted from the column later than rPI-3, even though it had a mass
nearly twice that of PI-3. This could be due to oligomeric PI-3 formation; however, as
was shown in chapter 3, rPI-3 elutes according to its molecular mass from the gel
filtration column. This odd behavior might otherwise be explained if pepsin possesses
an extraordinarily tight fold and a slightly smaller hydrodynamic volume than the
inhibitor, which is unlikely, or if pepsin transiently associates with the
chromatography resin. Though the pepsin elution was anomalous, the complex was
shown to be chromatographically separated from the two components. This suggests that
the interaction between pepsin and PI-3 is tight, and little or no dissociation was observed
during  the  chromatography.  Additionally,  samples  were  prepared  in  which  an  excess  of 
pepsin  was  mixed  with  rPI-3,  and  the  separation  of  the  complex,  at  64  ml  elution  volume, 
from  free  pepsin  (96  ml)  through  the  Superdex  75  gel  filtration  column  was  achieved. 
This  result  also  implied  that  the  pepsin:PI-3  complex  could  be  purified  from 
contaminants  and  uncomplexed  pepsin  and  PI-3  for  the  improvement  of  the  formation  of 
crystals  in  order  to  define  the  interaction  of  these  proteins  through  x-ray  diffraction 
analysis. 

Protein heterodimers were also separated on SDS polyacrylamide gels to
determine  if  they  were  covalent  duplexes  or  noncovalent  associations.  Fractions 
collected  from  the  analytical  gel  filtration  of  the  complex  were  analyzed  by  reducing  SDS 
polyacrylamide  gel  electrophoresis,  and  separate  bands  of  the  proteins  pepsin  and  PI-3 
were  observed,  as  were  two  bands  of  smaller  proteins,  but  not  a higher  molecular  mass 
band  (Figure  5-2).  The  absence  of  a band  of  molecular  mass  near  50  kDa  showed  that 
the  proteins  were  not  forming  covalent  complexes.  The  presence  of  the  two  smaller 
bands,  at  approximately  6 kDa  and  8 kDa,  implied  that  the  PI-3  was  being  hydrolyzed  by 
pepsin  at  some  stage  of  the  interaction.  In  another  experiment,  pepsin  and  rPI-3  were 
mixed  to  form  complexes  at  room  temperature,  and  at  times  from  30  seconds  to  30 
minutes,  aliquots  were  removed  and  immediately  quenched  in  Laemmli  sample  buffer. 
The  hydrolysis  of  PI-3  was  detected  at  all  times  of  incubation.  This  hydrolysis  of  PI-3 
suggested  a common  mechanism  for  inhibition  where  an  exposed  reactive  site  loop  from 
the  inhibitor  binds  directly  through  the  active  site  and  a single  bond  is  specifically 
hydrolyzed by the target enzyme.

Figure 5-2. Complex of Pepsin and rPI-3 Separated by SDS-PAGE

Protein from one major peak at 64 milliliters elution volume was collected after a 1:1 molar
ratio  of  pepsin  and  rPI-3  had  been  mixed  and  passed  through  a Superdex  75  gel  filtration 
column. All lanes shown are from the major protein peak for the complex.

To explore the possibility of identifying the reactive site, an N-terminal sequence
analysis was performed on the lower two bands of the protein gel. Table 5-1 presents the
results of the N-terminal sequencing of the upper and lower bands of hydrolyzed rPI-3.
The sequences indicated that the Leu3-Phe4 bond had been cut. This would be expected
from the substrate specificity for hydrophobic residues in the P1 and P1′ subsites of the
enzyme and the generally flexible nature of residues near the termini of proteins. The
other  two  sites  were  between  Met73-Phe74  and  Leu86-Phe87.  While  both  of  these  pairs 
of  residues  are  hydrophobic  and  thus  good  substrates  for  pepsin,  these  pairs  would  be 
predicted  to  be  buried  based  on  the  hydrophobic  side  chains.  Due  to  the  consistency  of 
the  proteolytic  hydrolysis  over  long  periods  of  time  (Figure  5-3),  the  sites  of  proteolysis 
of  PI-3  were  probably  not  the  result  of  completely  random  hydrolysis  of  the  entire 
protein.  The  specificity  of  hydrolysis  suggests  these  sequences  were  near  the  active  site 
when  PI-3  was  bound  to  pepsin. 


Table  5-1.  N-Terminal  Sequencing  Results  from  Bands  of  Hydrolyzed  rPI-3 


N-terminal Sequencing Resultᵃ     Band

Phe4-Ser5-Met6-Ser7               Lower band
Phe74-Asn75-Phe76-Val77           Upper band
Phe87-Ile88-Asp89-Gln90           Lower band

ᵃ Results compiled by University of Florida ICBR Protein Chemistry Core


The formation of the PI-3:pepsin complex was examined further to ascertain whether or
not the hydrolysis was truly part of the mechanism of enzyme inhibition. The time point
assay was repeated, but following the incubation of enzyme and inhibitor and before the
addition of LSB to the time point fractions, the aliquots were quenched with either a
50-fold excess of pepstatin, a small, tight-binding pepsin inhibitor, or with a 0.1 M CAPS pH
10.5  solution  at  which  pepsin  is  inactive.  If  PI-3  hydrolysis  were  required  by  the 
mechanism,  the  hydrolysis  would  be  expected  to  occur  before  the  addition  of  the 
pepstatin  or  the  pH  change,  and  the  hydrolysis  pattern  should  be  identical  to  the  previous 
data.  From  assays  of  pepsin,  5 minutes  of  incubation  time  is  sufficient  for  the  enzyme  to 
form  associations  with  PI-3;  therefore,  any  hydrolysis  of  PI-3  should  be  detectable  by  that 
time. 

The  results  (Figure  5-3)  show  that  adding  pepstatin  or  increasing  the  pH  above 
where pepsin is active greatly reduces the hydrolysis of the inhibitor. Interestingly, the
gel  of  the  quenching  by  excess  pepstatin  showed  more  hydrolysis  than  that  for  the 
inactivation  by  pH,  suggesting  that  the  pepstatin  had  to  compete  away  the  PI-3,  but  not 
all  of  the  PI-3  had  dissociated  before  addition  of  LSB.  The  pH  increase  directly 
interfered  with  pepsin  proteolysis  of  PI-3  by  inactivating  all  of  the  pepsin.  If  hydrolysis 
were  a requisite  mechanistic  step,  neither  the  rise  in  pH  nor  the  addition  of  pepstatin 
should  have  affected  the  degree  of  hydrolysis  after  the  complex  had  formed  and  a 
reactive  site  bond  had  been  hydrolyzed.  Therefore,  the  mechanism  does  not  proceed 
through  a hydrolyzed  inhibitor  intermediate. 

To  further  explore  the  pH  influence  on  preventing  PI-3  hydrolysis,  another  set  of 
PI-3:pepsin  complexes  was  prepared.  Pepsin  has  an  optimal  pH  in  vitro  of  3.5,  and  a pH 
activity  curve  suggests  that  pepsin  is  fully  active  in  a range  from  pH  2 to  pH  5 (Lin  et  al., 
1992),  with  catalytic  efficiency  decreasing  almost  linearly  from  pH  4.5  to  6.  The 
mechanism  of  pH  inactivation  of  the  PI-3  hydrolysis  was  explored,  in  which  the 
quenching buffers were titrated to pH 2, 3, 3.4, 4.5, 5, and 8.5. Aliquots of the protein
complex mixture were removed to tubes containing the different pH quenching buffers,
followed after five minutes by the addition of LSB, which is buffered to pH 6.8 by
Tris-Cl and present at 50 mM concentration when added to the protein samples. Samples
brought to pH 4.5, 5, and 8.5 before adding LSB did not show PI-3 hydrolysis, while the
three  aliquots  mixed  with  buffers  at  pH  2,  3 and  3.4  resulted  in  PI-3  hydrolysis.  These 
results  confirm  that  even  at  the  pH  limits  of  pepsin  activity,  PI-3  hydrolysis  is  prevented. 
These results continue to suggest that PI-3 hydrolysis is not a function of the mechanism,
but rather results from some proteolysis due to residual pepsin activity in the denaturant
SDS and the reducing environment of LSB. The hydrolysis occurs only at flexible regions of
rPI-3  near  the  amino-terminus  and  near  the  disulfide  bonding  residue  Cys79,  the  side 
chain  of  which  might  be  reduced  quickly  thus  untethering  the  nearby  amino  acids  to 
become  accessible  to  pepsin  hydrolysis. 

The  results  show  that  the  inhibitory  mechanism  does  not  require  a covalent 
interaction between PI-3 and target enzymes. These results also suggest that the
mechanism of inhibition does not require the hydrolysis of a peptide bond, as is seen with
some, but not all, other proteinase inhibitors. The result of the
hydrolysis  in  the  original  study  must  have  been  an  artifact  of  the  denaturant  exposing  a 
segment  of  the  PI-3  to  pepsin  before  pepsin  had  been  completely  inactivated  by  the 
conditions  of  the  buffer  while  still  interacting  with  the  inhibitor.  Due  to  the  selectivity  of 
the  N-terminal  sequencing  results,  PI-3  digestion  was  likely  the  result  of  a near  surface 
area  of  PI-3  unfolding  that  was  near  the  active  site  of  pepsin,  not  a completely  random 
digestion of the protein. The results suggest that the protein does not have to fill in the
active site of the proteinase completely, like enzyme prosegments or the serpin inhibitors,
the latter having a bond hydrolyzed during the enzyme inactivation. The protein was also
shown  to  be  a tight-binding,  but  not  covalently  binding,  inhibitor  of  pepsin,  and  the 
mechanism  of  inhibition  must  only  involve  a combination  of  the  intermolecular  forces  of 
hydrophobic  packing  and  hydrogen  bonding. 

Analyses of Limited Crystallographic Data from PI-3:Pepsin Complex

Crystallographic  studies  from  the  collaborating  team  at  the  University  of  Alberta 
have  suggested  that  some  electron  density  from  PI-3  appears  to  be  in  or  near  one  side  of 
the  active  site  of  pepsin.  Different  sets  of  diffraction  data  have  presented  different  views 
about  those  possible  contacts.  In  an  early  interpretation  of  incomplete  electron  density 
data  for  the  complex,  two  sequential  residues  of  a phenylalanine  and  arginine  from  rPI-3 
appeared  to  be  binding  within  the  prime  side  of  the  active  site  of  pepsin  (Petersen  et  al., 
1998). The only adjacent phenylalanine and arginine residues in the sequence are
Phe107-Arg108. Additional electron density for rPI-3 was detected near the prime side of the
“flap”  of  pepsin  on  the  amino-terminal  lobe  of  pepsin  and  near  the  prime  side  of  the 
active  site  of  the  carboxyl-terminal  lobe  of  pepsin.  The  carboxyl-terminal  domain  of 
pepsin  was  observed  to  be  shifted  away  from  the  amino-terminal  domain  compared  to 
structures  of  the  non-bound  pepsin.  Though  the  exact  physical  interaction  contributing  to 
the  domain  shift  is  uncertain  due  to  the  incomplete  structure  for  this  complex,  the 
observation  predicts  that  the  inhibitor  contacts  both  domains  of  pepsin  and  induces  a 
conformational  movement  of  the  domains  of  the  enzyme.  This  observation  also  supports 
a role  of  the  inhibitor  preventing  substrate  from  accessing  the  enzyme  active  site  either  by 
directly fitting into one side of the active site or occluding a region of the active site from
outside the active site.

During another crystallization of the complex, rPI-3 had been hydrolyzed into
smaller  fragments.  These  crystals  were  diffracted,  and  data  were  refined  to  a greater 
resolution  than  previous  data  obtained  for  the  complete  complex.  These  crystals  did  not 
contain  the  complete  inhibitor,  and  only  electron  density  attributed  to  a small  peptide  was 
detected within the active site of pepsin. The peptide was determined to be
Val77-Gly78-Cys79-Ser80-Val81-Leu82 within the active site, forming a disulfide bond with
Cys146-Thr147-Val148-Gln149. The observed disulfide bond between these two cysteine residues
is  in  agreement  with  the  disulfide  pairing  assignment  shown  in  chapter  4.  The  electron 
density  data  suggested  that  the  carboxyl  oxygen  of  Leu82  may  contact  the  catalytic 
pepsin  residues  in  the  prime  side  of  the  active  site  with  the  preceding  three  residues 
forming  some  contacts  to  the  active  site.  These  results,  though  different  from  the  earlier 
crystal diffraction data, are not necessarily mutually exclusive. As described in chapter 1,
other proteinaceous inhibitors of serine, cysteine, and metalloproteinases interact with
target  proteinases  through  more  than  one  segment. 

Based  on  the  composition  of  the  fragment,  three  peptides  were  synthesized  to 
functionally  characterize  the  affinity  of  this  peptide  to  pepsin.  A peptide  was  synthesized 
based  directly  on  the  peptide  from  the  diffraction  data,  Val-Gly-Cys-Ser-Val-Leu 
forming  a disulfide  bond  to  Cys-Thr-Val-Gln.  Another  peptide  was  synthesized  identical 
to this disulfide-bonding peptide but modified with a C-terminal amide group on the
leucine to better reflect the state of a residue within a protein, which would have only
the carbonyl oxygen and not the complete carboxylate group present on the synthesized
peptide. A third peptide was synthesized as the hexamer alone, Val-Gly-Cys-Ser-Val-Leu,
to detect whether the disulfide bond contributed to stabilizing the interaction. These
three peptides were
analyzed as potential inhibitors of pepsin, and the dissociation constants are presented in
Table  5-2.  The  only  one  of  the  three  synthesized  peptides  that  could  inhibit  the  pepsin 
hydrolysis  of  substrate  was  the  original  fragment  discovered  from  the  electron  density  of 
the  complex.  The  affinity  of  this  disulfide-linked  peptide  was  0.35  mM,  nearly  10^ 
weaker  association  than  the  full-length  PI-3  and  pepsin.  The  other  two  peptides  were 
unable  to  inhibit  pepsin  up  to  5 mM,  the  maximal  inhibitor  concentration  possible  in  the 
assay.  The  peptide  affinity  for  pepsin  was  also  measured  at  pH  2.0.  At  this  lower  pH,  the 
disulfide-linked  peptide  formed  2-fold  stronger  associations  with  pepsin  than  at  pH  3.5; 
however,  the  other  two  peptides  were  still  unable  to  inhibit  pepsin.  These  results  reflect  a 
requirement  for  the  presence  of  the  disulfide  bonding  cysteine  for  this  interaction.  The 
results  also  showed  that  the  carboxy  terminal  amide  reduced  the  binding  efficiency.  The 
latter  result  was  counterintuitive  as  far  as  this  peptide  mimicking  a region  within  the 
longer  protein  PI-3  because  an  amino  acid  within  the  full-length  protein  would  not  have  a 
free  a-carboxylate  group.  This  result  implies  that  either  the  mode  of  inhibition  by  this 
peptide  is  different  from  the  full-length  protein  or  that  additional  constraints  from  the  rest 
of  PI-3  might  aid  this  reactive  domain.  The  effect  of  this  peptide  has  not  been  shown  to 
reflect  the  same  mode  of  inhibition  as  the  full-length  protein;  however,  it  represents  a 
distinct pepsin-inhibitory domain that might be one of multiple reactive sites of PI-3.
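
The relative affinities discussed above can be checked directly from the reported constants. The short sketch below compares the disulfide-linked peptide (0.35 mM at pH 3.5, and 0.17 mM at pH 2.0 as listed in Table 5-2) with the wild-type rPI-3 value of 1.3 nM from Table 5-3. The variable names are illustrative, and the comparison assumes the two sets of measurements are directly comparable even though the table footnotes list different buffers.

```python
# Ratio of dissociation constants: a larger Ki means weaker binding.
MM_TO_NM = 1_000_000  # 1 mM = 1e6 nM

peptide_ki_mM = {"pH 2.0": 0.17, "pH 3.5": 0.35}  # disulfide-linked peptide, Table 5-2
full_length_ki_nM = 1.3                            # wild-type rPI-3 at pH 3.5, Table 5-3

# How much weaker the peptide binds than full-length rPI-3 at pH 3.5
fold_weaker = peptide_ki_mM["pH 3.5"] * MM_TO_NM / full_length_ki_nM

# How much tighter the peptide binds at pH 2.0 than at pH 3.5
ph_effect = peptide_ki_mM["pH 3.5"] / peptide_ki_mM["pH 2.0"]

print(f"peptide binds about {fold_weaker:.1e}-fold more weakly than full-length rPI-3")
print(f"peptide binds about {ph_effect:.1f}-fold more tightly at pH 2.0 than at pH 3.5")
```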
Table 5-2. Affinity of Peptides Based on Putative Reactive Site for Pepsin

Peptide                                        Dissociation Constant, Ki (mM)ᵃ
                                               pH 2.0           pH 3.5

Val-Gly-Cys-Ser-Val-Leu,                       0.17 ± 0.03      0.35 ± 0.05
  disulfide-linked to Cys-Thr-Val-Gln

Val-Gly-Cys-Ser-Val-Leu                        >5ᵇ              >5ᵇ

Val-Gly-Cys-Ser-Val-Leu-amide,                 >5ᵇ              >5ᵇ
  disulfide-linked to Cys-Thr-Val-Gln

ᵃ Measurements made at either pH 2.0 or pH 3.5 in 0.1 M sodium phosphate buffer.
ᵇ The latter two peptides were unable to inhibit pepsin at either pH, even up to the
maximum concentration in the assay of 5 mM.


Analyses  of  PI-3  Mutants 

Mutations  to  rPI-3  were  developed  to  continue  to  address  the  mechanism  of 
inhibition.  PI-3  had  been  shown  to  require  a non-covalent  interaction  specifically  with 
pepsin.  Only  a few  other  aspartic  proteinases  could  be  inhibited  by  PI-3  but  not  as 
strongly as pepsin (Valler et al., 1985). Reactive sites were hypothesized, in part, on
limited  crystallographic  data  and  sequence  alignments  of  PI-3  and  putative  homologous 
proteins.  Most  of  the  mutations  were  chosen  based  on  the  following  reasons. 

The  first  two  sites  were  chosen  based  on  the  complex  study,  in  which  the  peptide 
bonds  between  residues  3-4,  73-74  and  86-87  had  been  hydrolyzed  by  pepsin.  The 
original  explanation  for  these  data  was  a model  that  included  a reactive  loop  from  PI-3 
extending  through  the  active  site  of  pepsin,  as  seen  with  the  small  inhibitor  pepstatin, 
some  of  the  prosegments  of  aspartic  proteinases,  and  even  the  mechanism  of  some 
inhibitors of other proteinases. The focus on lysine residues was based on another
assumption that the tight-binding affinity should include contributions from electrostatic
or hydrogen bonding forces. The specificity of substrate hydrolysis by two aspartic
proteinases  was  also  taken  into  account.  The  interaction  of  PI-3  with  the  proteinase 
cathepsin  D was  quite  weak,  while  pepsin  had  a strong  affinity  for  PI-3.  The  substrate 
specificities  of  pepsin  and  cathepsin  D have  been  explored  (Rao  and  Dunn,  1995, 
Scarborough  et  al.,  1993,  Beyer  and  Dunn,  1998),  and  the  P2  position  in  cathepsin  D will 
not  tolerate  substrates  with  positive  charges,  while  pepsin  will.  If  the  inhibitor  extended  a 
substrate-like  loop  into  the  enzyme  active  site,  residues  surrounding  the  hydrolyzed 
peptide bonds at 73-74 and 86-87 could fill in the subsites with the S1 site filled by either
Met73 or Leu86 and the S1′ site by either Phe74 or Phe87. At the putative P2 position
for  both  sites  was  a lysine  residue,  K72  and  K85,  one  of  which  was  predicted  to  be 
critical  for  the  association  with  pepsin  and  lack  of  association  to  cathepsin  D.  The  critical 
lysine  residue  was  predicted  to  form  a hydrogen  bond  to  a glutamic  acid  or  glutamine  in 
the  S2  subsite  for  porcine  or  human  pepsin,  respectively.  Mutations  were  generated  at 
both  of  the  lysine  positions  72  and  85  to  glutamic  acid  and  leucine  residues.  The 
mutations  to  glutamate  and  leucine  were  based  on  the  assumption  that  PI-3  fit  within  the 
active  site  and  was  ineffective  to  cathepsin  D in  an  analogous  manner  to  the  lack  of 
substrate  hydrolysis  when  the  peptide  contains  a basic  residue  in  the  P2  position. 
Cathepsin  D substrate  binding  and  hydrolysis  is  most  efficient  when  the  P2  residue  is 
either  a leucine  or  glutamate. 
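
The assignment of residues to subsite positions follows the standard convention in which P1 and P1′ flank the scissile bond and numbering increases away from it in both directions. A minimal sketch of that bookkeeping, applied to the two internal cleavage sites discussed above, is given below; the function name and its `span` parameter are hypothetical and used only for illustration.

```python
def subsite_assignments(p1_residue: int, span: int = 2) -> dict:
    """Map substrate positions to residue numbers around a scissile bond.

    The bond lies between residue `p1_residue` (P1) and the next residue (P1').
    Positions P2, P3, ... run toward the N-terminus; P2', P3', ... toward the C-terminus.
    """
    positions = {}
    for k in range(1, span + 1):
        positions[f"P{k}"] = p1_residue - (k - 1)
        positions[f"P{k}'"] = p1_residue + k
    return positions

# Scissile bonds between residues 73-74 and 86-87
sites = {p1: subsite_assignments(p1) for p1 in (73, 86)}

for p1, mapping in sites.items():
    print(f"bond {p1}-{p1 + 1}: {mapping}")
```

For both sites the P2 position falls on a lysine (residue 72 or 85), which is the basis for the choice of mutation sites described above.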

After  a report  was  published  regarding  the  chemical  modification  of  PI-3  to  reduce 
the  affinity  of  the  protein  to  pepsin,  additional  residues  were  chosen  to  be  mutated.  The 
report by Kageyama (1998) suggested that the phenylisothiocyanate derivatization of PI-3
inactivated the protein. From the proteolysis of the modified PI-3, HPLC separation of
fragments and absorbance measurements showed that the most thoroughly modified
residues were K16, K85, K91, and K110. These were predicted to be surface residues
with  the  potential  to  hydrogen  bond  to  target  proteinases.  Since  the  K85  mutant  had  been 
developed  by  that  time  and  the  greatest  sequence  homology  between  the  different 
putative nematode proteins was in the carboxyl half of the protein, the residues at K91
and K110 were also mutated. Leucine and glutamic acid residues were again chosen to
observe possible effects of a bulky, hydrophobic residue or the change in hydrogen
bonding side chain from an amine to a carboxylate group.

The series of PI-3 mutants was genetically engineered, purified, and analyzed for
functional  inhibition  of  pepsin  as  well  as  other  aspartic  proteinases.  These  mutated 
proteins  were  purified  using  the  same  methods  as  for  the  wild-type  rPI-3,  and  the  yields 
were  similar  to  those  for  the  wild-type  protein. 

Pepsin  inhibition.  The  mutant  proteins  were  examined  in  spectrophotometric  assays 
as  inhibitors  of  pepsin.  The  mutant  proteins  were  able  to  inhibit  pepsin  (Table  5-3)  with 
Ki  values  similar  to  those  observed  previously  for  rPI-3  of  the  wild-type  sequence.  All  of 
these  dissociation  constants  determined  for  the  inhibitors  had  less  than  20%  error.  Most 
of  the  mutant  forms  did  not  deviate  significantly  from  the  value  for  the  wild-type  PI-3. 
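
The size of each mutational effect can be expressed as the ratio of the mutant Ki to the wild-type Ki, using the values listed in Table 5-3. The sketch below is illustrative only; the dictionary and variable names are hypothetical, and a ratio above 1 indicates weaker binding than wild type.

```python
import math

# Ki values (nM) from Table 5-3
ki_nM = {
    "Wild-Type": 1.3,
    "K72E": 0.55,
    "K72L": 5.6,   # Table 5-3 row for the K72L construct
    "K85E": 1.5,
    "K85L": 0.76,
    "K91E": 0.88,
    "K91L": 2.4,
    "K110E": 2.0,
    "K110L": 2.0,
    "L82A": 1.8,
}

wild_type = ki_nM["Wild-Type"]
fold_change = {
    name: value / wild_type for name, value in ki_nM.items() if name != "Wild-Type"
}

# The mutant whose Ki deviates most from wild type in either direction
largest = max(fold_change, key=lambda name: abs(math.log(fold_change[name])))

for name, ratio in fold_change.items():
    print(f"{name}: {ratio:.2f}x wild-type Ki")
print(f"largest deviation: {largest}")
```

Comparing on a logarithmic scale treats a 2-fold tighter and a 2-fold weaker interaction as equal in magnitude, which is why the logarithm of the ratio is used to rank the deviations.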

The greatest exceptions were measured for both forms of PI-3 mutated at K72. The
lysine-to-glutamic acid mutations were designed to locate residues that might form
hydrogen bonding or electrostatic interactions that stabilize the complex, by replacing the
stabilizing amine group with a destabilizing carboxyl group. Converting the basic side
chain to an acidic side chain surprisingly increased the affinity of PI-3 for pepsin by
nearly 2.5-fold. Due to the small change in the affinity, the side chain of K72 was not


Table 5-3. Dissociation Constants, Ki (nM), for Mutant rPI-3 Inhibitors and Pepsin

Mutant rPI-3     Ki (nM)(a)
Wild-Type        1.3  ± 0.21
K72E             0.55 ± 0.17
K72L/V93A        5.6  ± 1.4
K85E             1.5  ± 0.42
K85L             0.76 ± 0.16
K91E             0.88 ± 0.20
K91L             2.4  ± 0.28
K110E            2.0  ± 0.32
K110L            2.0  ± 0.12
L82A             1.8  ± 0.38

(a) All measurements were performed in 0.1 M sodium formate pH 3.5.

critical for a hydrogen bonding or electrostatic interaction. In fact, due to the length of the
side chain, the lysine at position 72 might sterically clash with pepsin or with part of the
inhibitor. During the mutagenesis for the K72E form of PI-3, the codon for V93 was
incidentally mutated to an alanine codon. The loss of the β-branched hydrocarbon might
have contributed to the increase in pepsin binding affinity if the valine residue was too
bulky for optimal association with pepsin. Converting the amino acid to the hydrophobic
residue leucine was designed in part to reflect the loss of a charge group. For the K72L
mutation, the result was more than a 4-fold weaker association with pepsin. The ten-fold
difference in pepsin binding affinity between the glutamate and leucine mutants suggests
that the residue is important for hydrogen bonding, perhaps to another segment of the
inhibitor that interacts with pepsin or to stabilize the reactive loop through hydrogen
bonding to solvent. An analogous assumption may be made about the differences
observed between pepsin affinity for the two forms of PI-3 mutated at K91; however, the
difference is less extreme (3-fold) and implies that K91 is less important than K72 in
stabilizing the binding. The mutations to the lysine residues at positions 85 and 110 did not
result in proteins with significant changes to the binding affinity for pepsin. These results
suggest that none of these four lysine residues (K72, K85, K91, and K110) is a critical
binding residue to target enzymes, though the lysine at position 72 appears to form
internal or weak external interactions that stabilize the inhibitor-enzyme complex.

Circular dichroism analyses. These mutant proteins were very similar in structure
to the wild-type rPI-3 based on CD analysis (Figure 5-4). Circular dichroism
spectropolarimetry was used to compare the spectra for the different mutants, thus
providing a qualitative assessment of the similarity of secondary structures of these
proteins. The expectation was that these proteins would not differ in the CD spectra due
to little or no difference in the protein function. The measurements were performed on a
JASCO J-500c spectropolarimeter, and the ellipticity data were converted to molar
ellipticity according to equation 8 in chapter 2, using protein concentrations determined
by quantitative amino acid analysis. The same pattern was observed in the spectra of
all of the proteins. The minimum at 208 nm was maintained for all mutant proteins,
and the inflection at 224 nm was also consistent for all proteins. One exception was the
greater positive ellipticity below 200 nm for the K72E mutant than for the other forms of
mutant PI-3. Another exception was the inflection at 215 nm not being as pronounced in
the spectra of the mutant proteins as was observed for the wild-type rPI-3.

Figure 5-4. Circular Dichroism Spectra of rPI-3 Mutants

Due to the general difference at 215 nm between the wild-type and these mutant
inhibitors, a secondary structure prediction was made from the ellipticity data from one
representative mutant rPI-3. The K2D method was again applied to the ellipticity
data from the K85L mutant. The prediction was for 29% α-helix and 18% β-strand,
which was similar to that for the wild-type inhibitor, 25%/22%. The predictions were
consistent, showing a lower percentage of β-strand in agreement with the lack of an
ellipticity inflection at 215 nm, an indicator for β-strands.

Aspartic proteinase inhibition. The first four developed mutant proteins, K72E,
K72L, K85E, and K85L, were also analyzed for the ability to inhibit other aspartic
proteinases. The trends for binding affinity from one mutant to the next were similar
(Table 5-4). The affinity of rPI-3 for proteinases decreased consistently in the order of
porcine pepsin, human cathepsin E, plasmepsin II, and human cathepsin D. In each case,
the loss of binding affinity was approximately 10-fold. The different mutant inhibitors
provided some measurable differences. The K72L and K85E proteins formed tighter
associations with pepsin than did the other forms of PI-3. The K72E/V93A mutant was
shown to have a weaker association with pepsin than other mutant PI-3 proteins. The
associations of cathepsin E and three of the mutant proteins (K72L, K85L, K85E) were
nearly identical to the reported value for the association of native PI-3 and cathepsin E,
unlike the weaker associations of the wild-type PI-3 and the K72E/V93A mutant with
cathepsin E. Plasmepsin II was inhibited moderately well (110-430 nM) by all five of
these inhibitors. This result was interesting because plasmepsin II has the greatest
sequence similarity to cathepsin D, which was poorly inhibited by all five forms of PI-3.
Plasmepsin II formed measurably stronger associations with PI-3 than did cathepsin D,
which was not initially predicted by the sequence similarities.

Table 5-4 shows that cathepsin D formed a weak association with PI-3. Unlike the
measurements for affinity of the other proteinases with rPI-3, the assays with cathepsin D
were performed near the maximum concentration of the inhibitor. At the extreme limit of
inhibitor concentration, cathepsin D was still 60-70% active. The data could not be fit
simultaneously to a competitive inhibition model without high error. The estimated Ki
values for cathepsin D inhibition and PI-3 were determined by the mean of 5-6 apparent
Ki values calculated by the Henderson equation, equation 3 in chapter 2, using different
inhibitor concentrations. The error in these determinations remains high; therefore,
differences in the affinity of PI-3 for cathepsin D are less significant than differences in
data for the other dissociation constants. The results consistently showed that all of the
inhibitor mutants purified form 10^3-fold weaker binding associations to cathepsin D than to
pepsin.

Lending further support that the major interactions between PI-3 and enzymes
occur at the prime side of the active site was the determination of the binding efficiency
of PI-3 to a double-mutant form of plasmepsin II. The two residues in the non-prime
active site subsites S3 and S2 of plasmepsin II were mutated from Met13 and Ile287 to
glutamic acid residues. The mutant plasmepsin II had been designed to study the
substrate specificity differences between pepsin and plasmepsin II, and the mutant
enzyme was observed to prefer substrates more like pepsin (Westling et al., 1999). If the
critical interaction occurred at hydrogen bonding side chains of enzyme residues, two
pepsin glutamate residues in the non-prime side of the active site would be candidates for
the specificity of inhibition. The binding efficiency of the mutant plasmepsin II and PI-3
was determined to be 310 nM, which was indicative of a similar or even weaker affinity
than observed for the wild-type plasmepsin II and PI-3. These data support the model of
the inhibitor occluding the prime side of the active site, though they do not rule out the
possibility of some other interactions in or near the non-prime side of the active site.

A comparison of the PI-3 binding efficiency to pepsin, the fungal aspartic
proteinase rhizopuspepsin, and a chimera of these two proteinases was performed during
the characterization of the chimera protein. The development of chimera proteins,
genetically combining one domain of one protein and the other domain from another
protein, has met with limited success (Bhatt, 1998). One successfully examined case
involved combining the amino-terminal domain of pepsin and the carboxyl-terminal
domain of rhizopuspepsin. The dissociation constant for inhibition of the chimera with
rPI-3 was 110 nM. No inhibition of rhizopuspepsin by PI-3 could be measured in the
assay. Therefore, the contribution of the amino-terminal, "pepsin," domain provided
sufficient recognition sites for the moderate inhibition measured. Not only must the
amino-terminal domains have sufficient recognition sites, but also the carboxyl-terminal
domains must contribute to the magnitude of the interaction, as the affinity of the chimera
and PI-3 was 100-fold weaker than that for PI-3 and pepsin. Since the opposite
orientation of the chimera has not been completely characterized, the question of
whether the carboxyl-terminal domain of pepsin has sufficient recognition sites for PI-3
interaction cannot be empirically answered. These results concur with the limited
diffraction data for the complete complex, in which the two domains of pepsin were
shifted away from one another. Such a movement of the domains can be explained by
the physical interaction of PI-3 with both domains in an induced fit model.

Discussion 

New insights into the mechanism of inhibition have been discovered from this study;
however, a complete mechanism remains to be defined. During the process of inhibition,
rPI-3 does not bond covalently to pepsin. Pepsin does not hydrolyze PI-3 as part of the
mechanism either, which suggests that rPI-3 does not extend through the catalytic
subsites S1 and S1' of the protein as a substrate would. This interpretation agrees with the
limited information obtained from x-ray diffraction studies of the pepsin:PI-3 complex.
Electron density of PI-3 has been detected in only the prime side of the active site of
pepsin. The electron density data also predict that the presence of the inhibitor induces a
slight conformational change of pepsin. The two domains of pepsin appeared to be
separated further from one another than observed in other solved structures of pepsin.
From the x-ray data, some amino acids were implicated in the interaction between pepsin
and PI-3, though the electron densities of amino acids of PI-3 had been difficult to assign
due to poor phase resolution of the x-ray diffraction data. The affinities of PI-3 for
rhizopuspepsin and the pepsin-rhizopuspepsin chimera showed that the pepsin
amino-terminus is sufficient for making the association with PI-3, though at a weaker
affinity, but it is uncertain if the pepsin N-terminus is absolutely necessary. The same
data also implied that the pepsin carboxyl terminus is necessary for the strong association
of PI-3; however, it is unclear if the C-terminus of pepsin is sufficient for the association.
The model of association therefore proposes that the amino-terminus of pepsin is
sufficient, and both domains appear to be required for the strength of the association.

The binding efficiency for the four aspartic proteinases decreased in
10-fold increments from pepsin to cathepsin E to plasmepsin II to cathepsin D. The
relation between association constants and free energy, ΔG = -RT ln(Ki), predicts that at
37°C each 10-fold loss in binding affinity is equivalent to a 1.4 kcal/mol loss in energy
stabilization of the proteinase-inhibitor complex. The free energy of stabilization for
heterodimeric interactions has been observed to be principally the result of both
hydrophobic packing and hydrogen bond formation (Jones and Thornton, 1996).
Hydrophobic packing has been measured by various means to contribute approximately
0.5-2.0 kcal/mol for each hydrophobic side chain buried, the magnitude depending
on the size of the hydrocarbon chain and the van der Waals fit within a hydrophobic
pocket (Andrews et al., 1984). Hydrogen bonding groups have been predicted to
contribute more than 1 kcal/mol to the association of two proteins, based on the
difference in the energy gain by the weak electrostatic interaction and the loss of
hydrogen bonding to solvent by the amino acid side chains. Empirical analyses of
mutations to reactive site residues have also shown that a critical residue can contribute
10-10 increase to the strength of the association of two proteins (Castro and Anderson,
1996). Less critical, but still important, interactions can improve the association strength
of proteins 5-100-fold (Huang et al., 1997). These improvements in affinity correlate to
gains in the free energy of 4-6 kcal/mol for a critical residue and 0.4-2.8 kcal/mol for
contributions from other amino acid contacts between two proteins.
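
The conversion between a fold-change in a dissociation constant and a free energy difference can be checked numerically. The short Python sketch below is an illustration only: the function names are hypothetical, the gas constant is taken as 1.987 cal/(mol K), and the sign convention simply reports the magnitude RT ln(fold) associated with a given ratio of Ki values.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); assumed value for this sketch


def fold_to_ddg(fold, temp_c=37.0):
    """Free energy difference (kcal/mol) corresponding to a fold change in Ki."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(fold)


def ddg_to_fold(ddg, temp_c=37.0):
    """Fold change in Ki corresponding to a free energy difference (kcal/mol)."""
    temp_k = temp_c + 273.15
    return math.exp(ddg / (R_KCAL * temp_k))


if __name__ == "__main__":
    # Fold changes discussed in the text, converted to kcal/mol at 37 C.
    for fold in (5, 10, 100):
        print(f"{fold:>4}-fold  -> {fold_to_ddg(fold):.2f} kcal/mol")
    # Free energies discussed in the text, converted back to fold changes.
    for ddg in (4.0, 6.0):
        print(f"{ddg:.1f} kcal/mol -> about 10^{math.log10(ddg_to_fold(ddg)):.1f}-fold")
```

Under these assumptions a 10-fold change corresponds to about 1.4 kcal/mol at 37°C, in agreement with the value quoted above, and the inverse function gives the fold change implied by any quoted free energy.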

The  physical  relevance  of  a point  mutation  on  protein  interactions  can  be  examined 
by  comparing  the  dissociation  constants  of  the  mutated  and  non-mutated  protein  to  the 
associated  protein.  When  considering  the  implications  of  the  effects  of  differences  in  the 
free  energy  of  associations,  differences  in  the  dissociation  constants  less  than  5-fold  are 
probably  not  due  to  an  important  protein-protein  contact  site.  In  fact,  direct  interactions 
from  the  inhibitor  reactive  site  to  the  enzyme  will  likely  require  greater  than  a 7-fold 
difference  in  dissociation  constants  between  a mutant  and  non-mutated  protein  to 
represent a change in the hydrophobic packing. An increase in dissociation constants
greater than 10-fold is likely required for the loss of a hydrogen bonding amino acid.

These general guidelines can be applied to the differences in the dissociation
constants for PI-3 point mutants and pepsin. The ten-fold difference in pepsin binding
affinity between the protein with glutamate and leucine mutations at residue 72 suggests
that the residue is important for hydrogen bonding, perhaps to another segment of the
inhibitor that interacts with pepsin or to stabilize the reactive loop through hydrogen
bonding to solvent. The 2.5-fold difference in pepsin binding affinity between the
wild-type (K72) and K72E/V93A proteins suggests that the wild-type lysine residue does
not form a critical hydrogen bond or salt bridge to the target enzyme. Additionally, the other
proteins with lysine mutations at positions 85, 91, and 110 did not appear to be important
to the binding association of PI-3 with pepsin. Though the 2- and 3-fold differences in
dissociation constants between the K85E and K85L and the K91E and K91L proteins
were significant, these differences of less than 5-fold do not represent changes in hydrogen
bonding or hydrophobic packing between the inhibitor and enzyme. The L82A mutation
was designed based on the peptide fragment of PI-3 from the crystallographic study, and
the residue was postulated to be in contact with the active site of pepsin. The loss of the
larger, branched chain of the leucine did not alter the affinity significantly. Therefore the
L82 side chain is not believed to be important to the association of PI-3 with pepsin;
however, the result did not rule out the possibility that the backbone amide or carbonyl
oxygen might contact the enzyme. These mutagenesis results did not isolate a precise
residue of PI-3 that provides an important influence on the interaction with proteinases.
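
The application of these guidelines to the values in Table 5-3 can be sketched as follows. This is an illustrative Python sketch rather than part of the original analysis: the function names and category wording are hypothetical, the labels are abbreviated to the lysine substitution (the incidental V93A change is not tracked), and the 5-, 7-, and 10-fold cutoffs are taken directly from the guidelines stated above.

```python
# Ki values (nM) for pepsin, transcribed from Table 5-3; labels abbreviated
# to the substitution at the lysine (or leucine) position only.
KI_NM = {
    "Wild-Type": 1.3,
    "K72E": 0.55, "K72L": 5.6,
    "K85E": 1.5, "K85L": 0.76,
    "K91E": 0.88, "K91L": 2.4,
    "K110E": 2.0, "K110L": 2.0,
    "L82A": 1.8,
}


def fold_difference(ki_a, ki_b):
    """Fold difference between two Ki values, always expressed as >= 1."""
    return max(ki_a, ki_b) / min(ki_a, ki_b)


def interpret(fold):
    """Map a fold difference onto the guideline categories used in the text."""
    if fold < 5:
        return "below 5-fold: probably not an important contact"
    if fold <= 7:
        return "5- to 7-fold: ambiguous"
    if fold <= 10:
        return "above 7-fold: consistent with a change in hydrophobic packing"
    return "above 10-fold: consistent with loss of a hydrogen bond"


if __name__ == "__main__":
    wild_type = KI_NM["Wild-Type"]
    # Each mutant compared with the wild-type inhibitor.
    for name, ki in KI_NM.items():
        if name != "Wild-Type":
            fold = fold_difference(ki, wild_type)
            print(f"{name:6s} vs wild-type: {fold:4.1f}-fold, {interpret(fold)}")
    # Glutamate and leucine substitutions compared at the same position.
    for pos in ("K72", "K85", "K91", "K110"):
        fold = fold_difference(KI_NM[pos + "E"], KI_NM[pos + "L"])
        print(f"{pos}E vs {pos}L: {fold:4.1f}-fold, {interpret(fold)}")
```

With these inputs, no single mutant differs from the wild type by 5-fold or more, and only the comparison of the two substitutions at position 72 exceeds 10-fold, which matches the interpretation given in the text.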

The peptides designed by the collaborating team at the University of Alberta were
not efficient inhibitors; only the disulfide-linked peptide lacking a C-terminal amide
inhibited pepsin, at a weak affinity of 0.17 mM and 0.35 mM at pH 2.0 and 3.5,
respectively. These results suggest that the carboxylate group on the leucine and the
disulfide bonding cysteine contribute to the interaction. The weak affinity of the peptide
for pepsin, and the fact that the only reported occurrence of this fragmentation of PI-3 was
due to the crystallization process, do not guarantee that the peptide corresponds to the
important reacting domain of the full-length PI-3. This fragmentation could be an artifact
of the crystallization process; however, the measurable affinity for pepsin suggests that this
peptide is sufficient to bind to pepsin and is likely one, if not the principal, interacting
segment of PI-3. This peptide has a 10^5-fold weaker association with pepsin than the
full-length PI-3, which suggests that an additional 2-5 amino acid contacts between the two
proteins are likely required to strengthen the association to the degree of the full-length
PI-3. In this and other laboratories, other peptides of varying lengths and compositions
based on the sequence of PI-3 have been synthesized and have yielded no apparent
binding to pepsin. Therefore, larger segments of the PI-3 sequence appear to be required
for structural stability of the reactive site(s), though the exact size is uncertain.

The model prediction of PI-3 interacting in and near the prime side of the proteinase
active site may offer predictions about residues of the proteinases that contribute to the
mode of inhibition. A sequence alignment of the four aspartic proteinases
porcine pepsin, plasmepsin II, cathepsin D, and rhizopuspepsin may reveal some
notable differences in the sequences that reflect the decrease in inhibitor binding
affinity for these proteinases. The structures for these four proteinases have been solved
and are available through the Protein Data Bank. These proteins have quite similar folds
that correlate closely with the alignment of the sequences. Therefore, sequences of the
proteinases were compared to the tertiary structure of these proteins to identify regions
located on the surface or in the active site on the prime side of the protein. This step
removed nearly 75% of the amino acid residues from the sequence comparison. The
sequence alignment of these prime side surface regions is shown in Figure 5-5. Of these
sites, the following showed clear differences in hydrogen bonding potential between
these four proteinases: 70, 86, 96-98, 127, 134, and 187. Three of these sites, positions
70, 86, and 187, have strikingly different amino acids and are candidates for hydrogen
bonding with PI-3. Additional amino acids show variation between pepsin and
one or two of the other proteinases, such as at amino acids 144-148, that might contribute
to the specificity not from just one residue, but from multiple residues. Any of the
residues in a local site might change the local surface charge interactions, but only a few
specific residues may be directly associating with PI-3.

The approximate scale of ten-fold decreases in associations between PI-3 and the
enzymes pepsin, cathepsin E, plasmepsin II, rhizopuspepsin, and cathepsin D
correlates with the increasing isoelectric points for these enzymes. Porcine pepsin
has the lowest pI of 3.2, and the human isoform of pepsin has a pI of 3.3, which has been
shown to form only slightly weaker associations with the native PI-3 than porcine pepsin
(Abu-Erreish and Peanasky, 1974a). Cathepsin E and plasmepsin II have isoelectric
points of 4.1 and 4.6 and associate with PI-3 nearly 10-fold and 100-fold weaker than
pepsin and PI-3, respectively. Rhizopuspepsin follows the correlation the least with a pI
of 4.6, but it has no measurable association with PI-3. Finally cathepsin D has a pI of 5.6
and is, at best, weakly associating with PI-3. A higher isoelectric point means that
the ratio of basic to acidic amino acid residues is greater, and this implies that the surface
of cathepsin D will have more positively charged residue side chains than the other four
enzymes. While the isoelectric points merely reflect the amino acid composition of
these enzymes, the general correlation of increasing pI with weaker association with the
inhibitor may reflect surface differences in charge groups that could contribute to the
differences in affinities.
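
The general trend described above can be illustrated with a simple correlation between the isoelectric points and the approximate loss in affinity. The Python sketch below is only a rough illustration: it uses the pI values quoted in the text and the approximate order-of-magnitude steps (1-, 10-, 100-, and 1000-fold weaker than pepsin) rather than the measured dissociation constants of Table 5-4, it omits rhizopuspepsin because no association could be measured, and with only four points the coefficient should not be over-interpreted. The names used are hypothetical.

```python
import math

# (enzyme, isoelectric point, approximate log10 of the fold-weakening
# of PI-3 binding relative to porcine pepsin, as described in the text)
ENZYMES = [
    ("porcine pepsin", 3.2, 0.0),
    ("cathepsin E", 4.1, 1.0),
    ("plasmepsin II", 4.6, 2.0),
    ("cathepsin D", 5.6, 3.0),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pis = [row[1] for row in ENZYMES]
log_folds = [row[2] for row in ENZYMES]

if __name__ == "__main__":
    print(f"Pearson r (pI vs log10 fold-weakening) = {pearson_r(pis, log_folds):.2f}")
```

A coefficient close to 1 for these approximate values is consistent with the qualitative statement that weaker association accompanies a higher isoelectric point, although rhizopuspepsin shows that pI alone does not determine the affinity.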

The differences in surface charge can be observed by viewing the tertiary structures
of the four enzymes: porcine pepsin, plasmepsin II, cathepsin D, and rhizopuspepsin.
Figure 5-6 shows the four enzymes looking at the prime side of the active site with the
amino terminal domain on the right and the carboxyl terminal domain on the left. The
active site is at the center and bottom of the structures. Alexander Wlodawer converted
these figures to surface charge space-filled models using the program GRASP (Nicholls
et al., 1991). The red areas represent oxygen atoms on the protein surface, and the
surface of pepsin appears to have only these negatively charged regions or non-charged
surface regions on this face of the enzyme. The other three protein models show a
mixture of the negatively charged and positively charged (blue) residues on the surface of
this face of the enzymes. Two general sites in particular have quite different charges
between pepsin and these other proteinases: one in the N-terminal lobe (right) where one
or two positively charged residues are surface exposed for the non-pepsin enzymes, and
one in the C-terminal lobe where another positively charged residue is exposed. The
residues corresponding to these sites are 87, 187, and 297 in the pepsin numbering, and
these were also observed from the sequence alignment in Figure 5-5 as possessing
different residues between pepsin and the other proteinases. One or more of these
enzyme surface areas are likely candidates that reduce the binding affinity of PI-3 for
plasmepsin II, cathepsin D, and rhizopuspepsin compared to pepsin.

Figure 5-6. Model Structures of Porcine Pepsin, Plasmepsin II, Cathepsin D, and
Rhizopuspepsin Indicating Surface Charge Distribution

The prime side of the active site of these four enzymes is shown with the amino-terminal
domain to the right and the carboxyl-terminal domain to the left. The active site is in the
center and bottom of each structure. The GRASP program was used to define the charge
density of these protein surfaces. Red areas represent negative charge and blue areas
represent positive charges.


CHAPTER  6 

CONCLUSIONS  AND  FUTURE  DIRECTIONS 


Proteinase inhibitors may have arisen biologically due to the need for specific
regulation of proteinases, as in the case of the intramolecular inhibitor, the zymogen
pro-region. Natural inhibitors of proteinases appear to have been evolutionarily coupled with
the emergence of proteinases. Between species, the presence of inhibitors may have been
a response to proteolytic challenges. Some of these intermolecular inhibitors arose as a
method to attack and locally disable proteolytic enzymes, and others developed as natural
defenses of an organism toward the proteolytic enzymes of another organism.

The model for inhibition of pepsin by PI-3 is closer to that seen for other parasite
inhibitors than for natural regulatory inhibitors. X-ray studies of other parasite inhibitors
complexed with proteinases have often revealed that these inhibitors bind primarily to
one side of the active site, not through the active site of target proteinases. In fact, some
have been shown to bind to the active site and to regions external to the active site to
enhance the affinity (e.g. hirudin) or for the majority of the association binding energy
(e.g. triabin). The data compiled here suggest that PI-3 will bind only to one side of the
active site, not through the active site. The inhibitor is neither hydrolyzed by the target
enzyme nor does it form a covalent interaction with the enzyme. The data suggest that the
interaction occurs on the prime side of the active site of the enzyme in physical contact
with both domains of the proteinase. PI-3 has the strongest affinity for pepsin of all
aspartic proteinases examined. The inhibitor likely contacts the proteinase through some
mixture of hydrophobic and hydrogen bonding interactions. Part or all of the target
specificity appears to derive from surface negative charges on pepsin that are absent in
plasmepsin II, cathepsin D, and rhizopuspepsin, implying that hydrogen bonding or
salt bridge formation is required for the affinity.

The early development of a recombinant expression system that was
sufficient for the production of PI-3 with the same properties as the native
protein provided the opportunity to study the protein structure and function. A
selenomethionine expression system was developed in order to grow crystals of a
pepsin:PI-3 complex that included an internal heavy atom derivative. The expression
was slower and yielded far less protein than the original expression method. The time
required to perform the selenomethionine expressions and to improve the yield was too
great to accept this as an important focus. Instead the collaborating team accepted a gift
of the bacterial culture that had been transformed and the protocol originally used in
order to continue pursuing that expression and purification problem.

The structural studies were also attempted through NMR studies of PI-3. This work
was originally assigned to another graduate student who left the university after just
beginning experiments on the NMR spectrometer. Samples of PI-3 were also supplied to
Art Edison who kindly made measurements with the NMR spectrometer, but due to a
problem with aggregation in the NMR tube and more pressing responsibilities, these
studies were halted. Solution studies of small, proteinaceous inhibitors have been
performed not only to study the tertiary structure of the protein; they can also be
expanded to study the forces that contribute to the protein conformational stability
through dynamic hydrogen exchange experiments. Currently without more information
about the structure of PI-3, a solution structure determined by NMR spectrometry is
another experimental challenge that can be pursued.

Further  analyses  on  the  contributions  of  the  cysteine  residues  to  the  conformational 
stability  could  be  performed  using  mutations  to  PI-3  at  the  cysteine  residues.  Such 
analyses  can  provide  greater  detail  into  the  folding  and  unfolding  of  PI-3  and  to  what 
degree  the  different  cysteine  residues  contribute  to  the  protein  stability.  The  protein 
unfolding  and  folding  experiments  can  be  studied  further  by  fluorescence  measured 
changes  as  in  this  analysis  or  by  circular  dichroism  spectropolarimetry  using  urea  or 
guanidinium  hydrochloride  as  protein  denaturants.  Precise  temperature  control  using  CD, 
fluorescence  spectrometry,  or  another  technique,  differential  scanning  calorimetry,  can 
provide  even  more  detailed  thermodynamic  analyses  of  the  enthalpic  and  heat  capacity 
change  contributions  to  more  precisely  define  the  thermodynamic  components 
contributing  to  the  Gibbs’  free  energy  of  protein  stability.  Another  question  that  cysteine 
mutations  can  pursue  is  whether  or  not  all  of  the  cysteine  residues  are  required  for  protein 
function. 

Structures of the inhibitor and of the complex of PI-3 and pepsin are still aims of
this research. Because the mutations made so far have not revealed direct contacts
between these proteins, a structure, even of the inhibitor alone, could provide a
reliable basis for predicting contact sites. A structure of the complex would directly
identify those sites and answer many of the questions about the mechanism and the roles
of specific amino acids. Even without a known structure, the experiments and results
described herein have raised many more questions and suggested potential experiments
that can be explored in the future.

Although the PI-3 mutagenesis studies have so far been fruitless in identifying critical
contact sites, they have eliminated 5 residues of the protein sequence as potential
critical binding residues. Other residues of importance have been identified from some
of the limited crystal data, including Arg105 and Cys79. Not all of the lysine residues
have been modified by mutagenesis, though most of those identified by Kageyama have been
modified and shown not to be critical. Lysine-16 is the only remaining lysine identified
by Kageyama that was not characterized. Other lysine residues could be modified based
on that work, though the sites he had identified as most prevalently modified have now
been shown not to be critical. Another possibility for addressing the hypothesis that
positive charges on PI-3 interact with enzyme carboxylate groups is to perform
alanine-scanning mutagenesis of the remaining lysine and arginine residues and to screen
the mutants for altered binding efficiency to pepsin.
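
The alanine scan proposed above could be planned with a short script that lists every
remaining lysine and arginine position as a substitution label. This is only a sketch:
the function name is hypothetical, and the sequence shown is a made-up placeholder, not
the PI-3 sequence.

```python
def alanine_scan(sequence, targets=("K", "R"), skip=()):
    """Return labels such as 'K7A' for each target residue in the sequence.

    Positions are 1-based. `skip` holds positions that have already been
    characterized and should be left out of the scan.
    """
    return [
        f"{residue}{position}A"
        for position, residue in enumerate(sequence, start=1)
        if residue in targets and position not in skip
    ]


# Placeholder sequence for illustration only (NOT the PI-3 sequence).
toy_sequence = "MKTRALKQ"
# Position 2 is treated as already characterized in this toy example.
print(alanine_scan(toy_sequence, skip={2}))
```

Each label in the resulting list would correspond to one mutant to construct and screen
for altered affinity to pepsin.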

Another approach to understand the specificity of PI-3 and to identify types of
amino acid interactions with enzymes is to study the binding phenomenon from the other
view: the differences in proteinases that contribute to the differences in binding
efficiency of PI-3. From the sequence alignments of multiple aspartic proteinases that
interact with PI-3 to different degrees and the GRASP images of solved proteinase
structures, specific sites on the proteinases can be targeted for modification. Specific
residues of pepsin that have been identified include Glu70, Tyr86, Thr134, Glu187, and
Glu297 in pepsin numbering. Figure 6-1 is a ribbon model of pepsin designed to show all
of the aspartate and glutamate residues on the surface of the prime side of the protein
as stick representations. While most of these negatively charged residues are conserved
among the aspartic proteinases aligned in Figure 5-5, the residues at positions Glu70,
Asp96, Asp142, Asp149, Glu187, and Glu297 are not conserved and are candidates for
mutation to study their importance to the inhibitor-enzyme association. These residues
could be mutated to alanine to remove hydrogen bonding side chains or to lysine to
introduce a potentially repelling charge in order to analyze the effect of those
specific residues on the interaction of pepsin and PI-3.
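
The selection of non-conserved acidic residues described above can be expressed as a
simple scan over an alignment. The sketch below assumes that "conserved" means every
aligned proteinase carries Asp or Glu in that column; the function name and the toy
alignment are hypothetical and do not reproduce Figure 5-5.

```python
def variable_acidic_positions(reference, others):
    """Return (column, residue) pairs where the reference sequence has Asp or
    Glu but at least one other aligned sequence does not.

    All sequences must come from the same alignment and have equal length.
    Columns are 1-based alignment columns, which differ from pepsin
    numbering whenever the alignment contains gaps.
    """
    if any(len(seq) != len(reference) for seq in others):
        raise ValueError("aligned sequences must have equal length")
    acidic = {"D", "E"}
    hits = []
    for column, ref_residue in enumerate(reference, start=1):
        if ref_residue in acidic and any(seq[column - 1] not in acidic for seq in others):
            hits.append((column, ref_residue))
    return hits


# Toy alignment for illustration only (not real proteinase sequences).
reference = "ADEGD"
others = ["ADQGD", "AEKGE"]
print(variable_acidic_positions(reference, others))
```

Positions flagged in this way would then be candidates for the alanine and lysine
substitutions proposed above.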

Finally, the use of the pepsin-rhizopuspepsin chimera construct provided an
exciting opportunity to explore the contributions from each domain. Purification of the
reverse chimera to homogeneity and its stabilization for characterization could
complete this analysis of the domain contributions to the interactions between PI-3 and
proteinases. This chimera and combinations of other proteinase domains can be
assayed for affinity for PI-3 and used to explore the domain surface contributions to
the mechanism of inhibition.

Understanding the mechanisms of protein inhibition has led to treatments for
diseases in the past, and the discovery of novel mechanisms of inhibition may lead to new
methods to treat diseases. The analyses of proteinase inhibition described herein were
performed to begin the discovery of the mechanism of PI-3 inhibition. The continuation
of these analyses may further support a novel method of proteinase inhibition that could
be used to engineer novel therapeutics, and may reveal more about the structural
properties that permit PI-3 to be functional in the nematode.